Products encoded by a methymycin/pikromycin biosynthetic gene cluster

ABSTRACT

A novel pathway for the synthesis of polyhydroxyalkanoates is provided. A method of synthesizing a recombinant polyhydroxyalkanoate monomer synthase is also provided. These recombinant polyhydroxyalkanoate monomer synthases are derived from multifunctional fatty acid synthases or polyketide synthases and generate hydroxyacyl acids capable of polymerization by a polyhydroxyalkanoate synthase. Also provided is a biosynthetic gene cluster for methymycin and pikromycin as well as a biosynthetic gene cluster for desosamine.

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application is a continuation-in-part application of U.S. application Serial No. 60/008,847, filed Dec. 19, 1995, the disclosure of which is incorporated by reference herein.

STATEMENT OF GOVERNMENT RIGHTS

[0002] This invention was made with a grant from the Government of the United States of America (grants GM48562, GM35906 and GM54346 from the National Institutes of Health and a grant from the Office of Naval Research). The Government may have certain rights in the invention.

BACKGROUND OF THE INVENTION

[0003] Polyhydroxyalkanoates (PHAs) are one class of biodegradable polymers. The first identified member of the PHA thermoplastics was polyhydroxybutyrate (PHB), the polymeric ester of D(−)-3-hydroxybutyrate. The biosynthetic pathway of PHB in the gram-negative bacterium Alcaligenes eutrophus is depicted in FIG. 1. PHAs related to PHB differ in the structure of the pendant arm, R (FIG. 2). For example, R=CH₃ in PHB, while R=CH₂CH₃ in polyhydroxyvalerate, and R=(CH₂)₄CH₃ in polyhydroxyoctanoate.

[0004] The genes responsible for PHB synthesis in A. eutrophus have been cloned and sequenced (Peoples et al., J. Biol. Chem., 264, 15293 (1989); Peoples et al., J. Biol. Chem., 264, 15298 (1989)). Three enzymes, β-ketothiolase (phbA), acetoacetyl-CoA reductase (phbB), and PHB synthase (phbC), are involved in the conversion of acetyl-CoA to PHB. The PHB synthase gene encodes a protein of M_(r)=63,900 which is active when introduced into E. coli (Peoples et al., J. Biol. Chem., 264, 15298 (1989)).

[0005] Although PHB represents the archetypical form of a biodegradable thermoplastic, its physical properties preclude significant use of the homopolymer form. Pure PHB is highly crystalline and, thus, very brittle. However, unique physical properties resulting from the structural characteristics of the R groups in a PHA copolymer may result in a polymer with more desirable characteristics. These characteristics include altered crystallinity, UV weathering resistance, glass to rubber transition temperature (T_(g)), melting temperature of the crystalline phase, rigidity and durability (Holmes et al., EPO 00052 459; Anderson et al., Microbiol. Rev., 54, 450 (1990)). Thus, these polyesters behave as thermoplastics, with melting temperatures of 50-180° C., which can be processed by conventional extrusion and molding equipment.

[0006] Traditional strategies for producing random PHA copolymers involve feeding short- and long-chain fatty acid monomers to bacterial cultures. However, this technology is limited by the monomer units which can be incorporated into a polymer by the endogenous PHA synthase and the expense of manufacturing PHAs by existing fermentation methods (Haywood et al., FEMS Microbiol. Lett., 57, 1 (1989); Doi et al., Int. J. Biol. Macromol., 12, 106 (1990); Steinbuchel et al., In: Novel Biomaterials from Biological Sources. D. Byron (ed.), MacMillan, NY (1991); Valentin et al., Appl. Microbiol. Biotechnol., 36, 507 (1992)).

[0007] The production of diverse hydroxyacyl-CoA monomers for homo- and co-polymeric PHAs also occurs in some bacteria through the reduction and condensation pathway of fatty acids. This pathway employs a fatty acid synthase (FAS) which condenses malonate and acetate. The resulting β-keto group undergoes three processing steps, β-keto reduction, dehydration, and enoyl reduction, to yield a fully saturated butyryl unit. However, this pathway provides only a limited array of PHA monomers which vary in alkyl chain length but not in the degree of alkyl group branching, saturation, or functionalization along the acyl chain.
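The three processing steps named above act in a fixed order on the β-carbon, so the state of that carbon depends on which steps are active. The following is a minimal illustrative sketch, not part of the disclosed method; the function and step names are hypothetical and the model assumes each step requires the product of the preceding one.

```python
# Illustrative model only; all names here are hypothetical.
# Order in which a FAS processes the beta-keto group after condensation.
PROCESSING_STEPS = ["keto_reduction", "dehydration", "enoyl_reduction"]

# State of the beta-carbon after 0, 1, 2 or 3 completed steps.
STATE_AFTER = {
    0: "beta-keto",
    1: "beta-hydroxy",
    2: "enoyl",
    3: "saturated",
}


def beta_carbon_state(active_steps):
    """Return the beta-carbon state when only `active_steps` can run.

    Assumption: processing stops at the first inactive step, because
    each step acts on the product of the one before it.
    """
    completed = 0
    for step in PROCESSING_STEPS:
        if step not in active_steps:
            break
        completed += 1
    return STATE_AFTER[completed]


if __name__ == "__main__":
    print(beta_carbon_state({"keto_reduction", "dehydration", "enoyl_reduction"}))
    print(beta_carbon_state({"keto_reduction", "enoyl_reduction"}))
```

Under this simplified model, removing the dehydration step leaves a β-hydroxy group on the chain, which is the kind of hydroxyacyl unit a PHA synthase polymerizes.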

[0008] The biosynthesis of polyketides, such as erythromycin, is mechanistically related to formation of long-chain fatty acids. However, polyketides, in contrast to fatty acids, retain ketone, hydroxyl, or olefinic functions and contain methyl or ethyl side groups interspersed along an acyl chain comparable in length to that of common fatty acids. This asymmetry in structure implies that the polyketide synthase (PKS), the enzyme system responsible for formation of these molecules, although mechanistically related to a FAS, results in an end product that is structurally very different from that of a long-chain fatty acid.

[0009] Because PHAs are biodegradable polymers that have the versatility to replace petrochemical-based thermoplastics, it is desirable that new, more economical methods be provided for the production of defined PHAs. Thus, what is needed are methods to produce recombinant PHA monomer synthases for the generation of PHA polymers.

SUMMARY OF THE INVENTION

[0010] The present invention provides a method of preparing a polyhydroxyalkanoate synthase. The method comprises introducing an expression cassette into a non-plant eukaryotic cell. The expression cassette comprises a DNA molecule encoding a polyhydroxyalkanoate synthase, e.g., a polyhydroxybutyrate synthase, operably linked to a promoter functional in the non-plant eukaryotic cell. The DNA molecule may be obtained from a bacterium such as Alcaligenes eutrophus. The DNA molecule encoding the polyhydroxyalkanoate synthase is then expressed in the cell. Thus, another embodiment of the invention provides a purified recombinant polyhydroxybutyrate synthase isolated from a host cell which expresses the synthase.

[0011] Another embodiment of the invention is a method of preparing a polyhydroxyalkanoate polymer. The method comprises introducing a first expression cassette and a second expression cassette into a eukaryotic cell. The first expression cassette comprises a DNA segment encoding a fatty acid synthase in which the dehydrase activity has been inactivated that is operably linked to a promoter functional in the eukaryotic cell, e.g., an insect cell. The inactivation preferably is via a mutation in the catalytic site of the dehydrase. The second expression cassette comprises a DNA segment encoding a polyhydroxyalkanoate synthase operably linked to a promoter functional in the eukaryotic cell. The expression cassettes may be on the same or separate molecules. The DNA segments in the expression cassettes are expressed in the cell so as to yield a polyhydroxyalkanoate polymer.

[0012] Another embodiment of the invention is a baculovirus expression cassette comprising a nucleic acid molecule encoding a polyhydroxyalkanoate synthase operably linked to a promoter functional in an insect cell. Preferably, the nucleic acid molecule is obtained from a bacterium, e.g., Alcaligenes eutrophus.

[0013] The present invention also provides an expression cassette comprising a nucleic acid molecule encoding a polyhydroxyalkanoate monomer synthase operably linked to a promoter functional in a host cell. The nucleic acid molecule comprises a plurality of DNA segments. Thus, the nucleic acid molecule comprises at least a first and a second DNA segment. No more than one DNA segment is derived from the eryA gene cluster of Saccharopolyspora erythraea. The first DNA segment encodes a first module and the second DNA segment encodes a second module, wherein the DNA segments together encode a polyhydroxyalkanoate monomer synthase. The source of at least one DNA segment is preferably bacterial DNA. It is preferred that the first DNA segment encodes the first module from the vep gene cluster and the second DNA segment encodes module 7 from the tyl P gene cluster. The nucleic acid molecule may optionally further comprise a third DNA segment encoding a polyhydroxyalkanoate synthase. Alternatively, a second nucleic acid molecule encoding a polyhydroxyalkanoate synthase may be introduced into the host cell.

[0014] Also provided is an isolated and purified DNA molecule. The DNA molecule comprises a plurality of DNA segments. Thus, the DNA molecule comprises at least a first and a second DNA segment. The first DNA segment encodes a first module and the second DNA segment encodes a second module. No more than one DNA segment is derived from the eryA gene cluster of Saccharopolyspora erythraea. Also, it is preferred that no more than one module is derived from the gene cluster from Streptomyces hygroscopicus that encodes rapamycin or the gene cluster that encodes spiramycin. Together the DNA segments encode a recombinant polyhydroxyalkanoate monomer synthase. A preferred embodiment of the invention employs a first DNA segment derived from the vep gene cluster of Streptomyces. Another preferred embodiment of the invention employs a second DNA segment derived from the tyl gene cluster of Streptomyces. A further preferred embodiment of the isolated DNA molecule of the invention includes a DNA segment encoding a polyhydroxyalkanoate synthase.

[0015] Yet another preferred embodiment is an isolated DNA molecule of the invention wherein the second DNA segment comprises a DNA encoding a thioesterase which is located at the 3′ end of the second DNA segment. More preferably, the second DNA segment comprises a DNA encoding an acyl carrier protein which is located 5′ to the DNA encoding the thioesterase. Even more preferably, the second DNA segment comprises a DNA encoding a linker region, wherein the DNA encoding the linker region is located between the DNA encoding the acyl carrier protein and the DNA encoding the thioesterase.

[0016] Another embodiment of the isolated DNA molecule of the invention comprises a first DNA segment comprising DNA encoding two acyl transferases, wherein the DNA encoding the first acyl transferase is 5′ to the DNA encoding the second acyl transferase. Preferably, the second acyl transferase adds acyl groups to malonyl-CoA.

[0017] Other embodiments of the isolated DNA molecule include a first DNA segment comprising a DNA encoding a dehydrase, a first DNA segment comprising a DNA encoding a dehydrase and an enoyl reductase, a second DNA segment comprising a DNA encoding an inactive dehydrase, or a first DNA segment comprising a DNA encoding an acyl transferase. A preferred acyl transferase binds an acyl CoA substrate.

[0018] A further embodiment of the isolated DNA molecule includes a first DNA segment encoding a first module and a second DNA segment encoding a second module, wherein the DNA segments together encode a recombinant polyhydroxyalkanoate monomer synthase, and wherein no more than one DNA segment is derived from the eryA gene cluster of Saccharopolyspora erythraea. Also preferably, at least one DNA segment is derived from the vep gene cluster or the tyl gene cluster. In one preferred embodiment, the first DNA segment encodes the first module from the vep gene cluster and the second DNA segment encodes module 7 from the tyl gene cluster.

[0019] Yet another embodiment of the invention is a method of providing a polyhydroxyalkanoate monomer. The method comprises introducing a DNA molecule into a host cell. The DNA molecule comprises a DNA segment encoding a recombinant polyhydroxyalkanoate monomer synthase operably linked to a promoter functional in the host cell. The DNA encoding the recombinant polyhydroxyalkanoate monomer synthase, which synthase comprises at least a first module and a second module, is expressed in the host cell so as to generate a polyhydroxyalkanoate monomer. Preferably, the first DNA segment encodes the first module from the vep gene cluster and the second DNA segment encodes module 7 from the tyl P gene cluster. Also preferably, the DNA molecule further comprises a DNA segment encoding a polyhydroxyalkanoate synthase.

[0020] Also provided is a method of preparing a polyhydroxyalkanoate polymer. The method comprises introducing a first DNA molecule and a second DNA molecule into a host cell. The first DNA molecule comprises a DNA segment encoding a recombinant polyhydroxyalkanoate monomer synthase. The recombinant polyhydroxyalkanoate monomer synthase comprises a plurality of modules. Thus, the monomer synthase comprises at least a first module and a second module. The first DNA molecule is operably linked to a promoter functional in a host cell. The second DNA molecule comprises a DNA segment encoding a polyhydroxyalkanoate synthase operably linked to a promoter functional in the host cell. The DNAs encoding the recombinant polyhydroxyalkanoate monomer synthase and polyhydroxyalkanoate synthase are expressed in the host cell so as to generate a polyhydroxyalkanoate polymer.

[0021] Yet another embodiment of the invention is an isolated and purified DNA molecule. The DNA molecule comprises a plurality of DNA segments. That is, the DNA molecule comprises at least a first and a second DNA segment. The first DNA segment encodes a fatty acid synthase and the second DNA segment encodes a module of a polyketide synthase. A preferred embodiment of the invention employs a second DNA segment encoding a module which comprises a β-ketoacyl synthase amino-terminal to an acyltransferase which is amino-terminal to a ketoreductase which is amino-terminal to an acyl carrier protein which is amino-terminal to a thioesterase. Other preferred embodiments of the invention include a second DNA segment that is 3′ to the DNA encoding the fatty acid synthase, a first DNA segment encoding a fatty acid synthase and a second DNA segment encoding a module of a polyketide synthase, or a second DNA segment that is separated from the first DNA segment by a DNA encoding a linker region. Preferred linker regions include the linker region from tyl ORF1 ACP₁-KS₂, tyl ORF1 ACP₂-KS₃, tyl ORF3 ACP₅-KS₆, eryA ORF1 ACP₁-KS₁, eryA ORF1 ACP₂-KS₂, eryA ORF2 ACP₃-KS₄, and eryA ORF2 ACP₅-KS₆.
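The preferred domain arrangement recited above (β-ketoacyl synthase, acyltransferase, ketoreductase, acyl carrier protein, thioesterase, from amino to carboxyl terminus) can be expressed as a simple ordering check. This is a sketch for illustration only: the abbreviations KS, AT, KR, ACP and TE and the function name are assumptions, and because the module is said to "comprise" these domains, the check tolerates additional domains in between.

```python
# Illustrative only; abbreviations and function name are assumptions.
# Amino-to-carboxyl order of the domains in the preferred module.
PREFERRED_ORDER = ["KS", "AT", "KR", "ACP", "TE"]


def has_preferred_order(domains):
    """Check that every preferred domain is present in `domains`
    (listed N-terminus first) and that they appear in the preferred
    relative order. Extra domains between them are tolerated.
    """
    positions = []
    for name in PREFERRED_ORDER:
        if name not in domains:
            return False
        positions.append(domains.index(name))
    # Relative order is preserved when the positions are ascending.
    return positions == sorted(positions)


if __name__ == "__main__":
    print(has_preferred_order(["KS", "AT", "KR", "ACP", "TE"]))
    print(has_preferred_order(["AT", "KS", "KR", "ACP", "TE"]))
```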

[0022] The invention also provides a method of preparing a polyhydroxyalkanoate monomer. The method comprises introducing a DNA molecule comprising a plurality of DNA segments into a host cell, e.g., an insect cell, a Streptomyces cell or a Pseudomonas cell. Thus, the DNA molecule comprises at least a first and a second DNA segment. The first DNA segment encodes a fatty acid synthase operably linked to a promoter functional in the host cell. Preferably, the fatty acid synthase is eukaryotic in origin. Alternatively, the fatty acid synthase is prokaryotic in origin. The second DNA segment encodes a polyketide synthase. Preferably, the second DNA segment encodes the tyl module F. The second DNA segment is located 3′ to the first DNA segment. The first DNA segment is linked to the second DNA segment so that the encoded protein is expressed as a fusion protein. The DNA molecule is then expressed in the host cell so as to generate a polyhydroxyalkanoate monomer.

[0023] Another embodiment of the invention is an expression cassette comprising a DNA molecule comprising a DNA segment encoding a fatty acid synthase and a polyhydroxyalkanoate synthase.

[0024] Also provided is a method of providing a polyhydroxyalkanoate monomer synthase. The method comprises introducing an expression cassette into a host cell. The expression cassette comprises a DNA molecule encoding a polyhydroxyalkanoate monomer synthase operably linked to a promoter functional in the host cell. The monomer synthase comprises a plurality of modules. Thus, the monomer synthase comprises at least a first and second module which together encode the monomer synthase. Optionally, the expression cassette further comprises a second DNA molecule encoding a polyhydroxyalkanoate synthase.

[0025] A further embodiment of the invention is an isolated and purified DNA molecule comprising a DNA segment which encodes a Streptomyces venezuelae polyketide synthase, e.g., a polyhydroxyalkanoate monomer synthase, a biologically active variant or subunit (fragment) thereof. Preferably, the DNA segment encodes a polypeptide having an amino acid sequence comprising SEQ ID NO:2. Preferably, the DNA segment comprises SEQ ID NO:1. The DNA molecules of the invention are double stranded or single stranded. A preferred embodiment of the invention is a DNA molecule that has at least about 70%, more preferably at least about 80%, and even more preferably at least about 90%, but less than 100%, contiguous sequence identity to the DNA segment comprising SEQ ID NO:1, e.g., a “variant” DNA molecule. A variant DNA molecule of the invention can be prepared by methods well known to the art, including oligonucleotide-mediated mutagenesis. See Adelman et al., DNA, 2, 183 (1983) and Sambrook et al., Molecular Cloning: A Laboratory Manual (1989).

[0026] The invention also provides an isolated, purified polyhydroxyalkanoate monomer synthase, e.g., a polypeptide having an amino acid sequence comprising SEQ ID NO:2, a biologically active subunit, or a biologically active variant thereof. Thus, the invention provides a variant polypeptide having at least about 80%, more preferably at least about 90%, and even more preferably at least about 95%, but less than 100%, contiguous amino acid sequence identity to the polypeptide having an amino acid sequence comprising SEQ ID NO:2. A preferred variant polypeptide, or a subunit of a polypeptide, of the invention includes a variant or subunit polypeptide having at least about 10%, more preferably at least about 50%, and even more preferably at least about 90%, the activity of the polypeptide having the amino acid sequence comprising SEQ ID NO:2. Preferably, a variant polypeptide of the invention has one or more conservative amino acid substitutions relative to the polypeptide having the amino acid sequence comprising SEQ ID NO:2. For example, conservative substitutions include aspartic/glutamic as acidic amino acids; lysine/arginine/histidine as basic amino acids; leucine/isoleucine, methionine/valine, alanine/valine as hydrophobic amino acids; serine/glycine/alanine/threonine as hydrophilic amino acids. The biological activity of a polypeptide of the invention can be measured by methods well known to the art, including but not limited to, methods described hereinbelow.
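The conservative substitution groups listed above can be encoded directly as sets of one-letter amino acid codes. The sketch below is illustrative only; the function name is hypothetical, and it treats the hydrophobic examples as the three separate pairs given in the text rather than as one merged group, which is an interpretive assumption.

```python
# Illustrative sketch; the function name is hypothetical.
# Groups follow the examples in the text, using one-letter codes.
CONSERVATIVE_GROUPS = [
    {"D", "E"},            # acidic: aspartic/glutamic
    {"K", "R", "H"},       # basic: lysine/arginine/histidine
    {"L", "I"},            # hydrophobic: leucine/isoleucine
    {"M", "V"},            # hydrophobic: methionine/valine
    {"A", "V"},            # hydrophobic: alanine/valine
    {"S", "G", "A", "T"},  # hydrophilic: serine/glycine/alanine/threonine
]


def is_conservative(original, replacement):
    """True if swapping `original` for `replacement` stays within one
    of the listed groups. Identical residues are not substitutions."""
    if original == replacement:
        return False
    return any(
        original in group and replacement in group
        for group in CONSERVATIVE_GROUPS
    )


if __name__ == "__main__":
    print(is_conservative("D", "E"))
    print(is_conservative("K", "D"))
```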

[0027] The invention also provides an isolated and purified nucleic acid segment comprising a nucleic acid sequence comprising a sugar (desosamine) biosynthetic gene cluster, a biologically active variant or fragment thereof, wherein the nucleic acid sequence is not derived from the eryC gene cluster of Saccharopolyspora erythraea. As described hereinbelow, the desosamine biosynthetic gene cluster from Streptomyces venezuelae was isolated, cloned and sequenced. The isolated nucleic acid segment comprising the gene cluster preferably includes a nucleic acid sequence comprising SEQ ID NO:3, or a fragment or variant thereof. The cluster was found to encode nine polypeptides including DesI (e.g., SEQ ID NO:8 encoded by SEQ ID NO:7), DesII (e.g., SEQ ID NO:10 encoded by SEQ ID NO:9), DesIII (e.g., SEQ ID NO:12 encoded by SEQ ID NO:11), DesIV (e.g., SEQ ID NO:14 encoded by SEQ ID NO:13), DesV (e.g., SEQ ID NO:16 encoded by SEQ ID NO:15), DesVI (e.g., SEQ ID NO:18 encoded by SEQ ID NO:17), DesVII (e.g., SEQ ID NO:20 encoded by SEQ ID NO:19), DesVIII (e.g., SEQ ID NO:22 encoded by SEQ ID NO:21), and DesR (e.g., SEQ ID NO:24 encoded by SEQ ID NO:23) (see FIG. 24). It is also preferred that the nucleic acid segment of the invention encoding DesR is not derived from the eryB gene cluster of Saccharopolyspora erythraea or the oleD gene from Streptomyces antibioticus.

[0028] The invention also provides a variant polypeptide having at least about 80%, more preferably at least about 90%, and even more preferably at least about 95%, but less than 100%, contiguous amino acid sequence identity to the polypeptide having an amino acid sequence comprising SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, or a fragment thereof. A preferred variant polypeptide, or a subunit or fragment of a polypeptide, of the invention includes a variant or subunit polypeptide having at least about 1%, more preferably at least about 10%, and even more preferably at least about 50%, the activity of the polypeptide having the amino acid sequence comprising SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, or SEQ ID NO:24. Thus, for example, the glycosyltransferase activity of a polypeptide of SEQ ID NO:20 can be compared to a variant of SEQ ID NO:20 having at least one amino acid substitution, insertion, or deletion relative to SEQ ID NO:20.

[0029] A variant nucleic acid sequence of the invention has at least about 80%, more preferably at least about 90%, and even more preferably at least about 95%, but less than 100%, contiguous nucleic acid sequence identity to a nucleic acid sequence comprising SEQ ID NO:3, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, or a fragment thereof.

[0030] Also provided is an expression cassette comprising a nucleic acid sequence comprising a desosamine biosynthetic gene cluster, a biologically active variant or fragment thereof operably linked to a promoter functional in a host cell, as well as host cells comprising an expression cassette of the invention. Thus, the expression cassettes of the invention are useful to express individual genes within the cluster, e.g., the desR gene which encodes a glycosidase or the desVII gene which encodes a glycosyltransferase having relaxed substrate specificity for polyketides and deoxysugars, i.e., the glycosyltransferase processes sugar substrates other than TDP-desosamine. Thus, the desVII gene can be employed in combinatorial biology approaches to synthesize a library of macrolide compounds having various polyketide and deoxysugar structures. Moreover, the expression of a glycosylase in a host cell which synthesizes a macrolide antibiotic may be useful in a method to reduce toxicity of, e.g., inactivate, the antibiotic. For example, a host cell which produces the antibiotic is transformed with an expression cassette encoding the glycosyltransferase. The recombinant glycosyltransferase is expressed in an amount that reversibly inactivates the antibiotic. To activate the antibiotic, the antibiotic, preferably the isolated antibiotic which is recovered from the host cell, is contacted with an appropriate native or recombinant glycosidase.

[0031] Preferably, the nucleic acid segment encoding desosamine in the expression cassette of the invention is not derived from the eryC gene cluster of Saccharopolyspora erythraea. Preferred host cells are prokaryotic cells, although eukaryotic host cells are also envisioned. These host cells are useful to express desosamine, analogs or derivatives thereof. Also provided is an expression cassette or host cell comprising antisense sequences from at least a portion of the desosamine biosynthetic gene cluster.

[0032] Another embodiment of the invention is a recombinant host cell, e.g., a bacterial cell, in which a portion of a nucleic acid sequence encoding desosamine in the host chromosome is disrupted, e.g., deleted or interrupted (e.g., by an insertion) with heterologous sequences, or substituted with a variant nucleic acid sequence of the invention, preferably so as to result in a decrease or lack of desosamine synthesis, and/or so as to result in the synthesis of an analog or derivative of desosamine. Preferably, the nucleic acid sequence which is disrupted is not derived from the eryC gene cluster of Saccharopolyspora erythraea. Thus, the recombinant host cell of the invention has at least one gene, i.e., desI, desII, desIII, desIV, desV, desVI, desVII, desVIII or desR, which is disrupted. One embodiment of the invention includes a recombinant host cell in which the desVI gene, which encodes an N-methyltransferase, is disrupted, for example, by replacement with an antibiotic resistance gene. Preferably, such a host cell produces an aglycone having an N-acetylated aminodeoxy sugar, 10-deoxymethynolide, a compound of formula (7), a compound of formula (8), or a combination thereof. Thus, the deletion or disruption of the desVI gene may be useful in a method for preparing novel sugars.

[0033] Another preferred embodiment of the invention is a recombinant bacterial host cell in which the desR gene, which encodes a glycosidase such as β-glucosidase, is disrupted. Preferably, the host cell synthesizes C-2′ β-glucosylated macrolide antibiotics, for example, a compound of formula (13), a compound of formula (14), or a combination thereof. Therefore, the invention further provides a compound of formula (8), (9), (13) or (14). It will be appreciated by those skilled in the art that each atom of the compounds of the invention having a chiral center may exist in and be isolated in optically active and racemic forms. Some compounds may exhibit polymorphism. It is to be understood that the present invention encompasses any racemic, optically active, polymorphic or stereoisomeric form, or mixtures thereof, of a compound of the invention, which possess the useful properties described herein, it being well known in the art how to prepare optically active forms (for example, by resolution of the racemic form by recrystallization techniques, by synthesis from optically active starting materials, by chiral synthesis, or by chromatographic separation using a chiral stationary phase) and how to determine activity using the standard tests described herein, or using other similar tests which are well known in the art.

[0034] Further provided is an isolated and purified nucleic acid segment comprising a nucleic acid sequence comprising a macrolide biosynthetic gene cluster (the “met/pik” or “pik” gene cluster) encoding methymycin, pikromycin, neomethymycin, narbomycin, or a combination thereof, or a biologically active variant or fragment thereof. It is preferred that the nucleic acid segment comprises SEQ ID NO:5, or a fragment or variant thereof. It is also preferred that the isolated and purified nucleic acid segment is from Streptomyces sp., such as Streptomyces venezuelae (e.g., ATCC 15439, MCRL 0306, SC 2366 or 3629), Streptomyces narbonensis, Streptomyces eurocidicus, Streptomyces zaomyceticus (MCRL 0405), Streptomyces flavochromogens, Streptomyces sp. AM400, and Streptomyces felleus, although isolated and purified nucleic acid from other organisms which produce methymycin, narbomycin, neomethymycin and/or pikromycin is also within the scope of the invention. The cloned genes can be introduced into an expression system and genetically manipulated so as to yield novel macrolide antibiotics, e.g., ketolides, as well as monomers for polyhydroxyalkanoate (PHA) biopolymers. Preferably, the nucleic acid sequence encodes PikR1 (e.g., SEQ ID NO:27 encoded by SEQ ID NO:26), PikR2 (e.g., SEQ ID NO:29 encoded by SEQ ID NO:28), PikAI (e.g., SEQ ID NO:31 encoded by SEQ ID NO:30), PikAII (e.g., SEQ ID NO:33 encoded by SEQ ID NO:32), PikAIII (e.g., SEQ ID NO:35 encoded by SEQ ID NO:34), PikAIV (e.g., SEQ ID NO:37 encoded by SEQ ID NO:36), PikB (which is the desosamine gene cluster described above), PikC (e.g., SEQ ID NO:39 encoded by SEQ ID NO:38), and PikD (e.g., SEQ ID NO:41 encoded by SEQ ID NO:40), a variant or a fragment thereof.

[0035] The invention also provides a variant polypeptide having at least about 80%, more preferably at least about 90%, and even more preferably at least about 95%, but less than 100%, contiguous amino acid sequence identity to the polypeptide having an amino acid sequence comprising SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41, or a fragment thereof. A preferred variant polypeptide, or a subunit of a polypeptide, of the invention includes a variant or subunit polypeptide having at least about 1%, more preferably at least about 10%, and even more preferably at least about 50%, the activity of the polypeptide having the amino acid sequence comprising SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID NO:39, or SEQ ID NO:41. The activities of polypeptides of the macrolide biosynthetic pathway of the invention are described below.

[0036] A variant nucleic acid sequence of the pik biosynthetic gene cluster of the invention has at least about 80%, more preferably at least about 90%, and even more preferably at least about 95%, but less than 100%, contiguous nucleic acid sequence identity to a nucleic acid sequence comprising SEQ ID NO:5, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID NO:40, or a fragment thereof.

[0037] The pikA gene encodes a polyketide synthase which synthesizes the macrolactones 10-deoxymethynolide and narbonolide, pikB encodes desosamine synthases which catalyze the formation and transfer of a deoxysugar moiety onto aglycones, the pikC gene encodes a P450 hydroxylase which catalyzes the conversion of YC-17 and narbomycin into methymycin, neomethymycin, and pikromycin, and the pikR1, pikR2 (possibly one for a 12-membered ring and the other for a 14-membered ring) and desR genes encode enzymes associated with bacterial self-protection. Thus, the isolated nucleic acid molecule of the invention encodes four active macrolide antibiotics, two of which have a 12-membered ring while the other two have a 14-membered ring. The regulation of the synthesis of 12- or 14-membered rings may be the result of the sequences in the spacer region between modules 5 and 6, as discussed below. Thus, the genetic mechanism underlying the alternative termination of polyketide synthesis may be useful to prepare novel antibiotics and PHA monomers.

[0038] The invention further provides isolated and purified nucleic acid segments, e.g., in the form of an expression cassette, for each of the individual genes in the macrolide biosynthetic gene cluster. For example, the invention provides an isolated and purified pikAV gene that encodes a thioesterase II. In particular, the thioesterase is useful to enhance the structural diversity of antibiotics and in PHA production, as the thioesterase modulates chain release and cyclization. For example, a thioesterase II gene having acyl-ACP coenzyme A transferase activity (e.g., a mutant pik TEII, bacterial, fungal or plant medium-chain-length thioesterase, an animal fatty acid thioesterase or a thioesterase from a polyketide synthase) is introduced at the end of a recombinant monomer synthase (see FIG. 36), which, in the presence of a PHA synthase, e.g., phaC1, produces a novel polyhydroxyalkanoate polymer. Alternatively, in the absence of a TEII domain, a fusion of a portion of PKS gene cluster with a PHA synthase may result in the transfer of an acyl chain from the PHA to the polymerase.

[0039] Also provided is a pikC gene that encodes a hydroxylase which is active at two positions on a 12-membered ring or at one position on a 14-membered ring. Such a gene may be particularly useful to prepare novel compounds through bioconversion or biotransformation.

[0040] The invention also provides an expression cassette comprising a nucleic acid segment comprising a macrolide biosynthetic gene cluster encoding methymycin, pikromycin, neomethymycin, narbomycin, or a combination thereof, or a biologically active variant or fragment thereof, operably linked to a promoter functional in a host cell. Further provided is a host cell comprising the nucleic acid segment encoding methymycin, pikromycin, neomethymycin, narbomycin, or a combination thereof, or a biologically active variant or fragment thereof. Moreover, the invention provides isolated and purified polypeptides of the invention, preferably obtained from host cells having the nucleic acid molecules of the invention. In addition, expression cassettes and host cells comprising antisense sequences of at least a portion of the macrolide biosynthetic gene cluster of the invention are envisioned.

[0041] Yet another embodiment of the invention is a recombinant host cell, e.g., a bacterial cell, in which a portion of the macrolide biosynthetic gene cluster of the invention is disrupted or replaced with a heterologous sequence or a variant nucleic acid segment of the invention, preferably so as to result in a decrease or lack of methymycin, pikromycin, neomethymycin, narbomycin, or a combination thereof, and/or so as to result in the synthesis of novel macrolides. Therefore, the invention provides a recombinant host cell in which a pikAI gene, a pikAII gene, a pikAIII gene (12-membered rings), a pikAIV gene (14-membered rings), a pikB gene cluster, a pikAV gene, a pikC gene, a pikD gene, a pikR1 gene, a pikR2 gene, or a combination thereof, is disrupted or replaced. A preferred embodiment of the invention is a host cell wherein the pikB (e.g., the desVI and desV genes), pikAI, pikAV or pikC gene is disrupted.

[0042] Moreover, as the nucleic acid segment comprising the macrolide biosynthetic gene cluster of the invention encodes a polyketide synthase, modules of that synthase are useful in methods to prepare recombinant polyhydroxyalkanoate monomer synthases and polymers in addition to macrolide antibiotics and derivatives thereof.

[0043] Thus, the invention provides an isolated and purified DNA molecule comprising a first DNA segment encoding a first module and a second DNA segment encoding a second module, wherein the DNA segments together encode a recombinant polyhydroxyalkanoate monomer synthase, and wherein at least one DNA segment is derived from the pikA gene cluster of Streptomyces venezuelae. Preferably, no more than one DNA segment is derived from the eryA gene cluster of Saccharopolyspora erythraea. In one embodiment of the invention, the 3′ most DNA segment of the isolated DNA molecule of the invention encodes a thioesterase II. Also provided is an expression cassette comprising a nucleic acid molecule encoding the polyhydroxyalkanoate monomer synthase operably linked to a promoter functional in a host cell.

[0044] Yet another embodiment of the invention is a method of providing a polyhydroxyalkanoate monomer. The method comprises introducing into a host cell a DNA molecule comprising a DNA segment encoding a recombinant polyhydroxyalkanoate monomer synthase operably linked to a promoter functional in the host cell. The recombinant polyhydroxyalkanoate monomer synthase comprises a first module and a second module, wherein at least one DNA segment is derived from the pikA gene cluster of Streptomyces venezuelae. The DNA encoding the recombinant polyhydroxyalkanoate monomer synthase is then expressed in the host cell so as to generate a polyhydroxyalkanoate monomer. Optionally, a second DNA molecule may be introduced into the host cell. The second DNA molecule comprises a DNA segment encoding a polyhydroxyalkanoate synthase operably linked to a promoter functional in the host cell. The two DNA molecules are expressed in the host cell so as to generate a polyhydroxyalkanoate polymer.

[0045] Another embodiment of the invention is an isolated and purified DNA molecule comprising a first DNA segment encoding a fatty acid synthase and a second DNA segment encoding a module from the pikA gene cluster of Streptomyces venezuelae. Such a DNA molecule can be employed in a method of providing a polyhydroxyalkanoate monomer. Thus, a DNA molecule comprising a first DNA segment encoding a fatty acid synthase and a second DNA segment encoding a polyketide synthase is introduced into a host cell. The first DNA segment is 5′ to the second DNA segment and the first DNA segment is operably linked to a promoter functional in the host cell. The first DNA segment is linked to the second DNA segment so that the linked DNA segments express a fusion protein. The DNA molecule is expressed in the host cell so as to generate a polyhydroxyalkanoate monomer.

[0046] Further provided is a method of providing a polyhydroxyalkanoate monomer synthase. The method comprises introducing into a host cell an expression cassette comprising a DNA molecule encoding a polyhydroxyalkanoate monomer synthase operably linked to a promoter functional in the host cell. The DNA molecule comprises a first DNA segment encoding a first module and a second DNA segment encoding a second module wherein the DNA segments together encode a polyhydroxyalkanoate monomer synthase. At least one DNA segment is derived from the pikA gene cluster of Streptomyces venezuelae. The DNA molecule is expressed in the host cell. Optionally, the DNA molecule further comprises a DNA segment encoding a polyhydroxyalkanoate synthase. Alternatively, a second, separate DNA molecule encoding a polyhydroxyalkanoate synthase is introduced into the host cell.

[0047] Also provided is a method for directing the biosynthesis of specific glycosylation-modified polyketides by genetic manipulation of a polyketide-producing microorganism. The method comprises introducing into a polyketide-producing microorganism a DNA sequence encoding enzymes in desosamine biosynthesis, e.g., a DNA sequence comprising SEQ ID NO:3, a variant or fragment thereof, so as to yield a microorganism that produces specific glycosylation-modified polyketides. Alternatively, an anti-sense DNA sequence of the invention may be employed. Then the glycosylation-modified polyketides are isolated from the microorganism. It is preferred that the DNA sequence is modified so as to result in the inactivation of at least one enzymatic activity in sugar biosynthesis or in the attachment of the sugar to a polyketide.

[0048] Thus, the modules encoded by the nucleic acid segments of the invention may be employed in the methods described hereinabove to prepare polyhydroxyalkanoates of varied chain length or having various side chain substitutions and/or to prepare glycosylated biopolymers. Therefore, the compounds produced by the recombinant host cells of the invention are useful as biopolymers, e.g., in packaging or biomedical applications, or to engineer PHA monomer synthases; pharmaceuticals such as chemotherapeutic agents, immunosuppressants, agents to treat asthma, chronic obstructive pulmonary disease as well as other diseases involving respiratory inflammation, cholesterol-lowering agents, or macrolide-based antibiotics which are active against a variety of organisms, e.g., bacteria, including multi-drug-resistant pneumococci and other respiratory pathogens, as well as viral and parasitic pathogens; or as crop protection agents (e.g., fungicides or insecticides) via expression of polyketides in plants. Methods employing these compounds, e.g., to treat a mammal, bird or fish in need of such therapy, such as a patient having a bacterial infection, are also envisioned.

[0049] As used herein, a “linker region” is an amino acid sequence present in a multifunctional protein which is less well conserved in an amino acid sequence than an amino acid sequence with catalytic activity.

[0050] As used herein, an “extender unit” catalytic or enzymatic domain is an acyl transferase in a module that catalyzes chain elongation by adding 2-4 carbon units to an acyl chain and is located carboxy-terminal to another acyl transferase. For example, an extender unit with methylmalonylCoA specificity adds acyl groups to a methylmalonylCoA molecule.

[0051] As used herein, a “polyhydroxyalkanoate” or “PHA” polymer includes, but is not limited to, linked units of related, preferably heterologous, hydroxyalkanoates such as 3-hydroxybutyrate, 3-hydroxyvalerate, 3-hydroxycaproate, 3-hydroxyheptanoate, 3-hydroxyhexanoate, 3-hydroxyoctanoate, 3-hydroxyundecanoate, and 3-hydroxydodecanoate, and their 4-hydroxy and 5-hydroxy counterparts.

[0052] As used herein, a “Type I polyketide synthase” is a single polypeptide with a single set of iteratively used active sites. This is in contrast to a Type II polyketide synthase which employs active sites on a series of polypeptides.

[0053] As used herein, a “recombinant” nucleic acid or protein molecule is a molecule where the nucleic acid molecule which encodes the protein has been modified in vitro, so that its sequence is not naturally occurring, or corresponds to naturally occurring sequences that are not positioned as they would be positioned in a genome which has not been modified.

[0054] A “recombinant” host cell of the invention has a genome that has been manipulated in vitro so as to alter, e.g., decrease or disrupt, or, alternatively, increase, the function or activity of at least one gene in the macrolide or desosamine biosynthetic gene cluster of the invention.

[0055] As used herein, a “multifunctional protein” is one where two or more enzymatic activities are present on a single polypeptide.

[0056] As used herein, a “module” is one of a series of repeated units in a multifunctional protein, such as a Type I polyketide synthase or a fatty acid synthase.

[0057] As used herein, a “premature termination product” is a product which is produced by a recombinant multifunctional protein which is different than the product produced by the non-recombinant multifunctional protein. In general, the product produced by the recombinant multifunctional protein has fewer acyl groups.

[0058] As used herein, a DNA that is “derived from” a gene cluster is a DNA that has been isolated and purified in vitro from genomic DNA, or synthetically prepared on the basis of the sequence of genomic DNA.

[0059] As used herein, the pik gene cluster includes sequences encoding a polyketide synthase (pikA), desosamine biosynthetic enzymes (pikB, also referred to as des), a cytochrome P450 (pikC), regulatory factors (pikD) and enzymes for cellular self-resistance (pikR).

[0060] As used herein, the terms “isolated and/or purified” refer to in vitro isolation of a DNA or polypeptide molecule from its natural cellular environment, and from association with other components of the cell, such as nucleic acid or polypeptide, so that it can be sequenced, replicated and/or expressed. Moreover, the DNA may encode more than one recombinant Type I polyketide synthase and/or fatty acid synthase. For example, “an isolated DNA molecule encoding a polyhydroxyalkanoate monomer synthase” is RNA or DNA containing greater than 7, preferably 15, and more preferably 20 or more sequential nucleotide bases that encode a biologically active polypeptide, fragment, or variant thereof, that is complementary to the non-coding, or complementary to the coding strand, of a polyhydroxyalkanoate monomer synthase RNA, or hybridizes to the RNA or DNA encoding the polyhydroxyalkanoate monomer synthase and remains stably bound under stringent conditions, as defined by methods well known to the art, e.g., in Sambrook et al., supra.

[0061] An “antibiotic” as used herein is a substance produced by a microorganism which, either naturally or with limited chemical modification, will inhibit the growth of or kill another microorganism or eukaryotic cell.

[0062] An “antibiotic biosynthetic gene” is a nucleic acid, e.g., DNA, segment or sequence that encodes an enzymatic activity which is necessary for an enzymatic reaction in the process of converting primary metabolites into antibiotics.

[0063] An “antibiotic biosynthetic pathway” includes the entire set of antibiotic biosynthetic genes necessary for the process of converting primary metabolites into antibiotics. These genes can be isolated by methods well known to the art, e.g., see U.S. Pat. No. 4,935,340.

[0064] Antibiotic-producing organisms include any organism, including, but not limited to, Actinoplanes, Actinomadura, Bacillus, Cephalosporium, Micromonospora, Penicillium, Nocardia, and Streptomyces, which either produces an antibiotic or contains genes which, if expressed, would produce an antibiotic.

[0065] An antibiotic resistance-conferring gene is a DNA segment that encodes an enzymatic or other activity which confers resistance to an antibiotic.

[0066] The term “polyketide” as used herein refers to a large and diverse class of natural products, including but not limited to antibiotic, antifungal, anticancer, and anti-helminthic compounds. Antibiotics include, but are not limited to, anthracyclines and macrolides of different types (polyenes and avermectins as well as classical macrolides such as erythromycins). Macrolides are produced by, for example, S. erythreus, S. antibioticus, S. venezuelae, S. fradiae and S. narbonensis.

[0067] The term “glycosylated polyketide” refers to any polyketide that contains one or more sugar residues.

[0068] The term “glycosylation-modified polyketide” refers to a polyketide having a changed glycosylation pattern or configuration relative to that particular polyketide's unmodified or native state.

[0069] The term “polyketide-producing microorganism” as used herein includes any microorganism that can produce a polyketide naturally or after being suitably engineered (i.e., genetically). Examples of actinomycetes that naturally produce polyketides include but are not limited to Micromonospora rosaria, Micromonospora megalomicea, Saccharopolyspora erythraea, Streptomyces antibioticus, Streptomyces albireticuli, Streptomyces ambofaciens, Streptomyces avermitilis, Streptomyces fradiae, Streptomyces griseus, Streptomyces hygroscopicus, Streptomyces tsukubaensis, Streptomyces mycarofaciens, Streptomyces platensis, Streptomyces violaceoniger, Streptomyces thermotolerans, Streptomyces rimosus, Streptomyces peucetius, Streptomyces coelicolor, Streptomyces glaucescens, Streptomyces roseofulvus, Streptomyces cinnamonensis, Streptomyces curacoi, and Amycolatopsis mediterranei (see Hopwood, D. A. and Sherman, D. H., Annu. Rev. Genet., 24:37-66 (1990), incorporated herein by reference). Other examples of polyketide-producing microorganisms that produce polyketides naturally include various Actinomadura, Dactylosporangium and Nocardia strains.

[0070] The term “sugar biosynthesis genes” as used herein refers to nucleic acid sequences from organisms such as Streptomyces venezuelae that encode sugar biosynthesis enzymes and is intended to include sequences of DNA from other polyketide-producing microorganisms which are identical or analogous to those obtained from Streptomyces venezuelae.

[0071] The term “sugar biosynthesis enzymes” as used herein refers to polypeptides which are involved in the biosynthesis and/or attachment of polyketide-associated sugars and their derivatives and intermediates.

[0072] The term “polyketide-associated sugar” refers to a sugar that is known to attach to polyketides or that can be attached to polyketides by the processes described herein.

[0073] The term “sugar derivative” refers to a sugar which is naturally associated with a polyketide but which is altered relative to the unmodified or native state, including but not limited to, N-3-α-desdimethyl D-desosamine.

[0074] The term “sugar intermediate” refers to an intermediate compound produced in a sugar biosynthesis pathway.

BRIEF DESCRIPTION OF THE FIGURES

[0075] FIG. 1. The PHB biosynthetic pathway in A. eutrophus.

[0076] FIG. 2. Molecular structure of common bacterial PHAs. Most of the known PHAs are polymers of 3-hydroxy acids possessing the general formula shown. For example, R=CH₃ in PHB, R=CH₂CH₃ in polyhydroxyvalerate (PHV), and R=(CH₂)₄CH₃ in polyhydroxyoctanoate (PHO).

[0077] FIG. 3. Comparison of the natural and recombinant pathways for PHB synthesis. The three enzymatic steps of PHB synthesis in bacteria involving 3-ketothiolase, acetoacetyl-CoA reductase, and PHB synthase are shown on the left. The two enzymatic steps involved in PHB synthesis in the pathway in Sf21 cells containing a rat fatty acid synthase with an inactivated dehydrase domain (ratFAS206) are shown on the right.

[0078] FIG. 4. Schematic diagram of the molecular organization of the tyl polyketide synthase (PKS) gene cluster. Open arrows correspond to individual open reading frames (ORFs) and numbers above an ORF denote a multifunctional module or synthase unit (SU). AT=acyltransferase; ACP=acyl carrier protein; KS=β-ketoacyl synthase; KR=ketoreductase; DH=dehydrase; ER=enoyl reductase; TE=thioesterase; MM=methylmalonylCoA; M=malonyl CoA; EM=ethylmalonyl CoA. Module 7 in tyl is also known as Module F.

[0079] FIG. 5. Schematic diagram of the molecular organization of the met PKS gene cluster.

[0080] FIG. 6. Strategy for producing a recombinant PHA monomer synthase by domain replacement.

[0081] FIG. 7. (A) 10% SDS-PAGE gel showing samples from various stages of the purification of PHA synthase; lane 1, molecular weight markers; lane 2, total protein of uninfected insect cells; lane 3, total protein of insect cells expressing a rat FAS (200 kDa; Joshi et al., Biochem. J., 296, 143 (1993)); lane 4, total protein of insect cells expressing PHA synthase; lane 5, soluble protein from sample in lane 4; lane 6, pooled hydroxylapatite (HA) fractions containing PHA synthase. (B) Western analysis of an identical gel using rabbit-α-PHA synthase antibody as probe. Bands designated with arrows are: a, intact PHB synthase with N-terminal alanine at residue 7 and serine at residue 10 (A7/S10); b, 44 kDa fragment of PHB synthase with N-terminal alanine at residue 181 and asparagine at residue 185 (A181/N185); c, PHB synthase fragment of approximately 30 kDa apparently blocked based on resistance to Edman degradation; d, 22 kDa fragment with N-terminal glycine at residue 187 (G187). Band d apparently does not react with rabbit-α-PHB synthase antibody (B, lane 6). The band of similar size in B, lane 4 was not further identified.

[0082] FIG. 8. N-terminal analysis of PHA synthase purified from insect cells. (a) The expected N-terminal 25 amino acid sequence of A. eutrophus PHA synthase. (b&c) The two N-terminal sequences determined for the A. eutrophus PHA synthase produced in insect cells. The bolded sequences are the actual N-termini determined.

[0083] FIG. 9. Spectrophotometric scans of substrate, 3-hydroxybutyrate CoA (HBCoA) and product, CoA. The wavelength at which the direct spectrophotometric assays were carried out (232 nm) is denoted by the arrow; substrate, HBCoA () and product, CoA (∘).

[0084] FIG. 10. Velocity of the hydrolysis of HBCoA as a function of substrate concentration. Assays were carried out in 40 or 200 μl assay volumes with enzyme concentration remaining constant at 0.95 mg/ml (3.8 μg/40 μl assay). Velocities were calculated from the linear portions of the assay curves subsequent to the characteristic lag period. The substrate concentration at half-optimal velocity, the apparent K_(m) value, was estimated to be 2.5 mM from this data.

[0085] FIG. 11. Double reciprocal plot of velocity versus substrate concentration. The concave upward shape of this plot is similar to results obtained by Fukui et al. (Arch. Microbiol., 110, 149 (1976)) with granular PHA synthase from Z. ramigera.

[0086] FIG. 12. Velocity of the hydrolysis of HBCoA as a function of enzyme concentration. Assays were carried out in 40 μl assay volumes with the concentration of HBCoA remaining constant at 8 μM.

[0087] FIG. 13. Specific activity of PHA synthase as a function of enzyme concentration.

[0088] FIG. 14. pH activity curve for soluble PHA synthase produced using the baculovirus system. Reactions were carried out in the presence of 200 mM P_(i). Buffers of pH<10 were prepared with potassium phosphate, while buffers of pH>10 were prepared with the appropriate proportion of Na₃PO₄.

[0089] FIG. 15. Assays of the hydrolysis of HBCoA with varying amounts of PHA synthase. Assays were carried out in 40 μl assay volumes with the concentration of HBCoA remaining constant at 8 μM. Initial A₂₃₂ values, originally between 0.62 and 0.77, were normalized to 0.70. Enzyme amounts used in these assays were, from the uppermost curve, 0.38, 0.76, 1.14, 1.52, 1.90, 2.28, 2.66, 3.02, 3.42, 7.6, and 15.2 μg, respectively.

[0090] FIG. 16. SDS/PAGE analysis of proteins synthesized at various time points during infection of Sf21 cells. Approximately 0.5 mg of total cellular protein from various samples was fractionated on a 10% polyacrylamide gel. Samples include: uninfected cells, lanes 1-4, days 0, 1, 2, 3, respectively; infection with BacPAK6::phbC alone, lanes 5-8, days 0, 1, 2, 3, respectively; infection with baculoviral clone containing ratFAS206 alone, lanes 9-12, days 0, 1, 2, 3, respectively; and ratFAS206 and BacPAK6 infected cells, lanes 13-16, days 0, 1, 2, 3, respectively. A=mobility of FAS, B=mobility of PHA synthase. Molecular weight standard lanes are marked M.

[0091] FIG. 17. Gas chromatographic evidence for PHB accumulation in Sf21 cells. Gas chromatograms from various samples are superimposed. PHB standard (Sigma) is chromatogram #7 showing a propylhydroxybutyrate elution time of 10.043 minutes (s, arrow). The gas chromatograms of extracts of the uninfected (#1); singly infected with ratFAS206 (#2, day 3); and singly infected with PHA synthase (#3, day 3) are shown at the bottom of the figure. Gas chromatograms of extracts of dual-infected cells at day 1 (#4), 2 (#5), and 3 (#6) are also shown exhibiting a peak eluting at 10.096 minutes (x, arrow). The peak of dual-infected, day 3 extract (#6) was used for mass spectrometry (MS) analysis.

[0092] FIG. 18. Gas chromatography-mass spectrometry analysis of PHB. The characteristic fragmentation of propylhydroxybutyrate at m/z of 43, 60, 87, and 131 is shown. A) standard PHB from bacteria (Sigma), and B) peak X from ratFAS206 and BacPAK6::phbC baculovirus infected, day 3 (#6, FIG. 17) Sf21 cells expressing rat FAS dehydrase inactivated protein and PHA synthase.

[0093] FIG. 19. Map of the vep (Streptomyces venezuelae polyene encoding) gene cluster.

[0094] FIG. 20. Plasmid map of pDHS502.

[0095] FIG. 21. Plasmid map of pDHS505.

[0096] FIG. 22. Cloning protocol for pDHS505.

[0097] FIG. 23. Nucleotide sequence (SEQ ID NO:1) and corresponding amino acid sequence (SEQ ID NO:22) of vep ORFI.

[0098] FIG. 24. Schematic diagram of the desosamine biosynthetic pathway and the enzymatic activity associated with each of the desosamine biosynthetic polypeptides.

[0099] FIG. 25. Schematic of the conversion of the inactive (diglycosylated) form of methymycin and pikromycin to the active form of methymycin and pikromycin.

[0100] FIG. 26. Schematic diagram of the desosamine biosynthetic pathway.

[0101] FIG. 27. Pathway for the synthesis of a compound of formula 7 and 8 in desVI mutants of Streptomyces.

[0102] FIG. 28. The methymycin/pikromycin biosynthetic gene cluster and the structure and biosynthesis of methymycin, neomethymycin, narbomycin, and pikromycin in S. venezuelae. Methymycin: R₁=OH, R₂=H, neomethymycin: R₁=H, R₂=OH; narbomycin R₃=H, pikromycin R₃=OH. Each circle represents an enzymatic domain in PKS protein. ACP, acyl carrier protein; KS, β-ketoacyl-ACP synthase; KS^(Q), a KS-like domain; AT, acyltransferase; KR, β-ketoacyl ACP reductase; DH, β-hydroxyl-thioester dehydratase; ER, enoyl reductase; TEI, thioesterase domain; TEII, type II thioesterase. Des represents all eight enzymes in desosamine synthesis and transfer which include DesI, DesII, DesIII, DesIV, DesV, DesVI, DesVIII, and DesVII.

[0103] FIG. 29. Organization of the pik cluster in S. venezuelae. Each arrow represents an open reading frame (ORF). The direction of transcription and relative sizes of the ORFs deduced from nucleotide sequence are indicated. The cluster is composed of four genetic loci: pikA, pikB (des), pikC, and pikR. Cosmid clones are denoted as overlapping lines.

[0104] FIG. 30. Conversion of YC-17 and narbomycin by PikC P450 hydroxylase.

[0105] FIG. 31. Nucleotide sequence (SEQ ID NO:5) and inferred amino acid sequence (SEQ ID NO:6) of the pik gene cluster.

[0106] FIG. 32. Nucleotide sequence (SEQ ID NO:3) and inferred amino acid sequence (SEQ ID NO:4) of the desosamine gene cluster.

[0107] FIG. 33. S. venezuelae AX916 construct useful to prepare a polyketide having a shorter chain length compared to wild-type pikA. pik module 2 is fused to pik module 5, and module 3 and 4 are deleted, so as to encode a three module PKS which produces two macrolides, a triketide and a tetraketide.

[0108] FIG. 34. Recombinant PKS having a wild-type thioesterase II.

[0109] FIG. 35. pAX703 construct, an expression and complementation vector. The PikTEII gene can be replaced with an EcoRI-NsiI fragment. The phaC1 gene can be replaced with a PacI-DraI fragment.

[0110] FIG. 36. Strategy for C7 polymer production. mTEII is a mutant pikTEII, an acyl-ACP CoA transferase; phaC1 is a PHA polymerase 1 from P. oleovorans which may have racemase activity. In a strain having these constructs, AX916, a PHA polymer is produced.

[0111] FIG. 37. Strategy for C5 polymer production. A PHA polymerase gene phaC1 is directly fused to pik module 2, so as to result in a fusion that transfers an acyl chain from the PKS protein directly to the polymerase by the prosthetic group on the ACP domain of the PKS.

[0112] FIG. 38. Codons for specified amino acids.

[0113] FIG. 39. Exemplary and preferred amino acid substitutions.

DETAILED DESCRIPTION OF THE INVENTION

[0114] The invention described herein can be used for the production of a diverse range of biodegradable PHA polymers through genetic redesign of DNA encoding a FAS or a PKS, such as a Streptomyces spp. Type I PKS polypeptide, to provide a recombinant PHA monomer synthase. Different PHA synthases can then be tested for their ability to polymerize the monomers produced by the recombinant PHA monomer synthase into a biodegradable polymer. The invention also provides a method by which various PHA synthases can be tested for their specificity with respect to different monomer substrates.

[0115] The potential uses and applications of PHAs produced by PHA monomer synthases and PHA synthases include both medical and industrial applications. Medical applications of PHAs include surgical pins, sutures, staples, swabs, wound dressings, blood vessel replacements, bone replacements and plates, stimulation of bone growth by piezoelectric properties, and biodegradable carrier for long-term dosage of pharmaceuticals. Industrial applications of PHAs include disposable items such as baby diapers, packaging containers, bottles, wrappings, bags, and films, and biodegradable carriers for long-term dosage of herbicides, fungicides, insecticides, or fertilizers.

[0116] In animals, the biosynthesis of fatty acids de novo from malonyl-CoA is catalyzed by FAS. For example, the rat FAS is a homodimer with a subunit structure consisting of 2505 amino acid residues having a molecular weight of 272,340 Da. Each subunit consists of seven catalytic activities in separate physical domains (Amy et al., Proc. Natl. Acad. Sci. USA, 86, 3114 (1989)). The physical location of six of the catalytic activities, ketoacyl synthase (KS), malonyl/acetyltransferase (M/AT), enoyl reductase (ER), ketoreductase (KR), acyl carrier protein (ACP), and thioesterase (TE), has been established by (1) the identification of the various active site residues within the overall amino acid sequence by isolation of catalytically active fragments from limited proteolytic digests of the whole FAS, (2) the identification of regions within the FAS that exhibit sequence similarity with various monofunctional proteins, (3) expression of DNA encoding an amino acid sequence with catalytic activity to produce recombinant proteins, and (4) the identification of DNA that does not encode catalytic activity, i.e., DNA encoding a linker region. (Smith et al., Proc. Natl. Acad. Sci. USA, 73, 1184 (1976); Tsukamoto et al., J. Biol. Chem., 23, 16225 (1988); Rangan et al., J. Biol. Chem., 266, 19180 (1991)).

[0117] The seventh catalytic activity, dehydrase (DH), was identified as physically residing between AT and ER by an amino acid comparison of FAS with the amino acid sequences encoded by the three open reading frames of the eryA polyketide synthase (PKS) gene cluster of Saccharopolyspora erythraea. The three polypeptides that comprise this PKS are constructed from “modules” which resemble animal FAS, both in terms of their amino acid sequence and in the ordering of the constituent domains (Donadio et al., Gene, 111, 51 (1992); Benh et al., Eur. J. Biochem., 204, 39 (1992)).

[0118] One embodiment of the invention employs a FAS in which the DH is inactivated (FAS DH-). The FAS DH- employed in this embodiment of the invention is preferably a eukaryotic FAS DH- and, more preferably, a mammalian FAS DH-. The most preferred embodiment of the invention is a FAS where the active site in the DH has been inactivated by mutation. For example, Joshi et al. (J. Biol. Chem., 268, 22508 (1993)) changed the His⁸⁷⁸ residue in the rat FAS to an alanine residue by site-directed mutagenesis. In vitro studies showed that a FAS with this change (ratFAS206) produced 3-hydroxybutyrylCoA as a premature termination product from acetyl-CoA, malonyl-CoA and NADPH.

[0119] As shown below, a FAS DH- effectively replaces the β-ketothiolase and acetoacetyl-CoA reductase activities of the natural pathway by producing D(−)-3-hydroxybutyrate as a premature termination product, rather than the usual 16-carbon product, palmitic acid. This premature termination product can then be incorporated into PHB by a PHB synthase (See Example 2).
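The logic of premature termination in a FAS DH- can be illustrated with a minimal sketch. It assumes the standard order of β-carbon processing after each condensation (ketoreduction, then dehydration, then enoyl reduction), with each step requiring the product of the step before it. The function name and state labels below are hypothetical and chosen only for illustration; they do not appear in the specification.

```python
# Order in which the reductive activities act on the beta-keto group
# after each condensation (assumed standard FAS/PKS processing order).
REDUCTIVE_ORDER = ["KR", "DH", "ER"]

# State of the beta-carbon after the named activity has acted.
# None means no reductive activity has acted yet.
STATE_AFTER = {
    None: "3-ketoacyl",
    "KR": "3-hydroxyacyl",
    "DH": "2-enoyl",
    "ER": "saturated acyl",
}


def beta_carbon_state(active_domains):
    """Return the beta-carbon state reached with the given active domains.

    Processing stops at the first missing or inactivated activity,
    because each step needs the product of the previous one.
    """
    last = None
    for domain in REDUCTIVE_ORDER:
        if domain not in active_domains:
            break
        last = domain
    return STATE_AFTER[last]


# Wild-type FAS: all three activities are present.
print(beta_carbon_state({"KR", "DH", "ER"}))
# FAS DH-: the dehydrase is inactivated, so ER never sees a substrate.
print(beta_carbon_state({"KR", "ER"}))
```

Under these assumptions the DH- case stops at the 3-hydroxyacyl stage, which corresponds to the 3-hydroxybutyrate premature termination product described above, whereas the wild-type case proceeds to a saturated chain that can re-enter elongation.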

[0120] Another embodiment of the invention employs a recombinant Streptomyces spp. PKS to produce a variety of β-hydroxyCoA esters that can serve as monomers for a PHA synthase. One example of a DNA encoding a Type I PKS is the eryA gene cluster, which governs the synthesis of erythromycin aglycone deoxyerythronolide B (DEB). The gene cluster encodes six repeated units, termed modules or synthase units (SUs). Each module or SU, which comprises a series of putative FAS-like activities, is responsible for one of the six elongation cycles required for DEB formation. Thus, the processive synthesis of asymmetric acyl chains found in complex polyketides is accomplished through the use of a programmed protein template, where the nature of the chemical reactions occurring at each point is determined by the specificities in each SU.

[0121] Two other Type I PKS are encoded by the tyl (tylosin) (FIG. 4) and met (methymycin) (FIG. 5) gene clusters. The macrolide multifunctional synthases encoded by tyl and met provide a greater degree of metabolic diversity than that found in the eryA gene cluster. The PKSs encoded by the eryA gene cluster only catalyze chain elongation with methylmalonylCoA, as opposed to tyl and met PKSs, which catalyze chain elongation with malonylCoA, methylmalonylCoA and ethylmalonylCoA. Specifically, the tyl PKS includes two malonylCoA extender units and one ethylmalonylCoA extender unit, and the met PKS includes one malonylCoA extender unit. Thus, a preferred embodiment of the invention includes, but is not limited to, replacing catalytic activities encoded in met PKS open reading frame 1 (ORF1) to provide a DNA encoding a protein that possesses the required keto group processing capacity and short-chain acylCoA ester starter and extender unit specificity necessary to provide a saturated β-hydroxyhexanoylCoA or unsaturated β-hydroxyhexenoylCoA monomer.
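The effect of extender unit specificity on the product can be sketched as simple carbon bookkeeping: one elongation cycle per module, with each extender unit contributing its carbons less the one lost as CO₂ on condensation. The sketch below models eryA as six methylmalonylCoA cycles, as described above. The propionyl starter unit is an assumption that is not stated in the text, and the function and table names are hypothetical.

```python
# Carbons retained in the chain per unit after decarboxylative condensation.
EXTENDER_CARBONS = {
    "malonyl": 2,
    "methylmalonyl": 3,  # two backbone carbons plus a methyl branch
    "ethylmalonyl": 4,   # two backbone carbons plus an ethyl branch
}

# Starter units; the choice of starter for a given PKS is an assumption here.
STARTER_CARBONS = {"acetyl": 2, "propionyl": 3}


def chain_carbons(starter, extenders):
    """Total carbons in the acyl chain after one cycle per extender unit."""
    return STARTER_CARBONS[starter] + sum(EXTENDER_CARBONS[e] for e in extenders)


# eryA: six modules, each elongating with methylmalonylCoA.
ery_modules = ["methylmalonyl"] * 6
print(chain_carbons("propionyl", ery_modules))  # 21

# Hypothetical variant: one module respecified for malonylCoA.
variant = ["methylmalonyl"] * 5 + ["malonyl"]
print(chain_carbons("propionyl", variant))  # 20
```

With the assumed propionyl starter, the six-module case gives 21 carbons, and exchanging a single extender specificity removes one branch carbon. This is the kind of change in side chain substitution that module replacement is intended to produce.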

[0122] In order to manipulate the catalytic specificities within each module, DNA encoding a catalytic activity must remain undisturbed. To identify the amino acid sequences between the amino acid sequences with catalytic activity, the “linker regions,” amino acid sequences of related modules, preferably those encoded by more than one gene cluster, are compared. Linker regions are amino acid sequences which are less well conserved than amino acid sequences with catalytic activity (Witkowski et al., Eur. J. Biochem., 198, 571 (1991)).
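The comparison described above can be sketched as a per-column conservation score over aligned module sequences, with runs of poorly conserved columns flagged as candidate linker regions. This is only an illustrative sketch: the scoring rule (fraction of sequences sharing the most common residue), the threshold, the minimum run length, and the toy sequences are all hypothetical choices, not values taken from the specification or the cited reference.

```python
def column_identity(alignment):
    """Fraction of sequences sharing the most common residue at each column.

    `alignment` is a list of equal-length aligned sequences; '-' marks a gap.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must have equal length")
    scores = []
    for column in zip(*alignment):
        residues = [r for r in column if r != "-"]
        if not residues:
            scores.append(0.0)
            continue
        top = max(residues.count(r) for r in set(residues))
        scores.append(top / len(column))
    return scores


def low_conservation_runs(scores, threshold=0.6, min_len=3):
    """Return (start, end) half-open column ranges scoring below threshold."""
    runs = []
    start = None
    # A trailing sentinel above the threshold closes a run at the end.
    for i, score in enumerate(scores + [1.0]):
        if score < threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i))
            start = None
    return runs


# Toy alignment (hypothetical): conserved flanks around a variable stretch.
toy = [
    "GHSLGEAPSTQVAAHVAG",
    "GHSLGEDTRSKLAAHVAG",
    "GHSLGEGQPAELAAHVAG",
]
print(low_conservation_runs(column_identity(toy)))  # [(6, 11)]
```

In the toy alignment the flanking columns are identical in all three sequences, so only the variable stretch at columns 6 to 10 is reported. In practice the comparison would use full module sequences from more than one gene cluster and a proper alignment method.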

[0123] In an alternative embodiment of the invention, to provide a DNAencoding a Type I PKS module with a TE and lacking a functional DH, aDNA encoding a module F, containing KS, MT, KR, ACP, and TE catalyticactivities, is introduced at the 3′ end of a DNA encoding a first module(FIG. 6). Module F introduces the final (R)-3-hydroxyl acyl group at thefinal step of PHA monomer synthesis, as a result of the presence of a TEdomain. DNA encoding a module F is not present in the eryA PKS genecluster (Donadio et al., supra, 1991).

[0124] A DNA encoding a recombinant monomer synthase is inserted into anexpression vector. The expression vector employed varies depending onthe host cell to be transformed with the expression vector. That is,vectors are employed with transcription, translation and/orpost-translational signals, such as targeting signals, necessary forefficient expression of the genes in various host cells into which thevectors are introduced. Such vectors are constructed and transformedinto host cells by methods well known in the art. See Sambrook et al.,Molecular Cloning: A Laboratory Manual, Cold Spring Harbor (1989).Preferred host cells for the vectors of the invention include insect,bacterial, and plant cells. Preferred insect cells include Spodopterafrugiperda cells such as Sf21, and Trichoplusia ni cells. Preferredbacterial cells include Escherichia coli, Streptomyces and Pseudomonas.Preferred plant cells include monocot and dicot cells, such as maize,rice, wheat, tobacco, legumes, carrot, squash, canola, soybean, potato,and the like.

[0125] Moreover, the appropriate subcellular compartment in which to locate the enzyme in eukaryotic cells must be considered when constructing eukaryotic expression vectors. Two factors are important: the site of production of the acetyl-CoA substrate, and the available space for storage of the PHA polymer. To direct the enzyme to a particular subcellular location, targeting sequences may be added to the sequences encoding the recombinant molecules.

[0126] The baculovirus system is particularly amenable to the introduction of DNA encoding a recombinant FAS or a PKS monomer synthase because an increasing variety of transfer plasmids are becoming available which can accommodate a large insert, and the virus can be propagated to high titers. Moreover, insect cells are adapted readily to suspension culture, facilitating relatively large-scale recombinant protein production. Further, recombinant proteins tend to be produced exclusively as soluble proteins in insect cells, thus obviating the need for refolding, a task that might be particularly daunting in the case of a large multifunctional protein. The Sf21/baculovirus system has routinely expressed milligram quantities of catalytically active recombinant fatty acid synthase. Finally, the baculovirus/insect cell system provides the ability to construct and analyze different synthase proteins for the ability to polymerize monomers into unique biodegradable polymers.

[0127] A further embodiment of the invention is the introduction of at least one DNA encoding a PHA synthase and a DNA encoding a PHA monomer synthase into a host cell. Such synthases include, but are not limited to, A. eutrophus 3-hydroxy, 4-hydroxy, and 5-hydroxy alkanoate synthases, Rhodococcus ruber C₃-C₅ hydroxyalkanoate synthases, Pseudomonas oleovorans C₆-C₁₄ hydroxyalkanoate synthases, P. putida C₆-C₁₄ hydroxyalkanoate synthases, P. aeruginosa C₅-C₁₀ hydroxyalkanoate synthases, P. resinovorans C₄-C₁₀ hydroxyalkanoate synthases, Rhodospirillum rubrum C₄-C₇ hydroxyalkanoate synthases, R. gelatinosus C₄-C₇ hydroxyalkanoate synthases, Thiocapsa pfennigii C₄-C₈ hydroxyalkanoate synthases, and Bacillus megaterium C₄-C₅ hydroxyalkanoate synthases.

[0128] The introduction of DNA(s) encoding more than one PHA synthase may be necessary to produce a particular PHA polymer due to the specificities exhibited by different PHA synthases. As multifunctional proteins are altered to produce unusual monomeric structures, synthase specificity may be problematic for particular substrates. Although the A. eutrophus PHB synthase utilizes only C4 and C5 compounds as substrates, it appears to be a good prototype synthase for initial studies since it is known to be capable of producing copolymers of 3-hydroxybutyrate and 4-hydroxybutyrate (Kunioka et al., Macromolecules, 22, 694 (1989)) as well as copolymers of 3-hydroxyvalerate, 3-hydroxybutyrate, and 5-hydroxyvalerate (Doi et al., Macromolecules, 19, 2860 (1986)). Other synthases, especially those of Pseudomonas aeruginosa (Timm et al., Eur. J. Biochem., 209, 15 (1992)) and Rhodococcus ruber (Pieper et al., FEMS Microbiol. Lett., 96, 73 (1992)), can also be employed in the practice of the invention. Synthase specificity may be alterable through molecular biological methods.

[0129] In yet another embodiment of the invention, a DNA encoding a FAS and a PHA synthase can be introduced into a single expression vector, obviating the need to introduce the genes into a host cell individually.

[0130] A further embodiment of the invention is the generation of a DNA encoding a recombinant multifunctional protein, which comprises a FAS, of either eukaryotic or prokaryotic origin, and a PKS module F. Module F will carry out the final chain extension to include two additional carbons and the reduction of the β-keto group, which results in a (R)-3-hydroxy acyl CoA moiety.

[0131] To produce this recombinant protein, DNA encoding the FAS TE is replaced with a DNA encoding a linker region which is normally found in the ACP-KS interdomain region of bimodular ORFs. DNA encoding a module F is then inserted 3′ to the DNA encoding the linker region. Different linker regions, such as those described below which vary in length and amino acid composition, can be tested to determine which linker most efficiently mediates or allows the required transfer of the nascent saturated fatty acid intermediate to module F for the final chain elongation and keto reduction steps. The resulting DNA encoding the protein can then be tested for expression of long-chain β-hydroxy fatty acids in insect cells, such as Sf21 cells, or in Streptomyces or Pseudomonas. The expected 3-hydroxy C-18 fatty acid can serve as a potential substrate for PHA synthases which are able to accept long-chain alkyl groups. A preferred embodiment of the invention is a FAS that has a chain length specificity of 4 to 22 carbons.

[0132] Examples of linker regions that can be employed in this embodiment of the invention include, but are not limited to, the ACP-KS linker regions encoded by the tyl ORF1 (ACP₁-KS₂; ACP₂-KS₃), and ORF3 (ACP₅-KS₆), and eryA ORF1 (ACP₁-KS₁; ACP₂-KS₂), ORF2 (ACP₃-KS₄) and ORF3 (ACP₅-KS₆).

[0133] This approach can also be used to produce shorter chain fatty acid groups by limiting the ability of the FAS unit to generate long-chain fatty acids. Mutagenesis of DNA encoding various FAS catalytic activities, starting with the KS, may result in the synthesis of short-chain (R)-3-hydroxy fatty acids.

[0134] The PHA polymers are then recovered from the biomass. Large-scale solvent extraction can be used, but is expensive. An alternative method involving heat shock with subsequent enzymatic and detergent digestive processes is also available (Byrom, Trends Biotechnol., 5, 246 (1987); Holmes, In: Developments in Crystalline Polymers, D. C. Bassett (ed.), pp. 1-65 (1988)). PHB and other PHAs are readily extracted from microorganisms by chlorinated hydrocarbons. Refluxing with chloroform has been extensively used; the resulting solution is filtered to remove debris and concentrated, and the polymer is precipitated with methanol or ethanol, leaving low-molecular-weight lipids in solution. Longer side-chain PHAs show a less restricted solubility than PHB and are, for example, soluble in acetone. Other strategies adopted include the use of ethylene carbonate and propylene carbonate as disclosed by Lafferty et al. (Chem. Rundschau, 30, 14 (1977)) to extract PHB from biomass. Scandola et al. (Int. J. Biol. Macromol., 10, 373 (1988)) reported that 1 M HCl-chloroform extraction of Rhizobium meliloti yielded PHB of M_(w)=6×10⁴ compared with 1.4×10⁶ when acetone was used.

[0135] Methods are well known in the art for the determination of the PHB or PHA content of microorganisms, the composition of PHAs, and the distribution of the monomer units in the polymer. Gas chromatography and high-pressure liquid chromatography are widely used for quantitative PHB analysis. See Anderson et al., Microbiol. Rev., 54, 450 (1990) for a review of such methods. NMR techniques can also be used to determine polymer composition, and the distribution of monomer units.

[0136] Preparation of Variant Nucleic Acid Molecules and Variant Polypeptides of the Invention

[0137] The present invention also contemplates nucleic acid sequences which hybridize under stringent hybridization conditions to the nucleic acid sequences set forth herein. Stringent hybridization conditions are well known in the art and define a degree of sequence identity greater than about 80 to about 90%. Thus, nucleic acid sequences encoding variant polypeptides (FIG. 38), or nucleic acid sequences having conservative (silent) nucleotide substitutions (FIG. 37), are within the scope of the invention. Preferably, variant polypeptides encoded by the nucleic acid sequences of the invention are biologically active. The present invention also contemplates naturally occurring allelic variations and mutations of the nucleic acid sequences described herein.

[0138] As is well known in the art, because of the degeneracy of the genetic code, there are numerous other DNA and RNA molecules that can code for the same polypeptides as those encoded by the exemplified biosynthetic genes and fragments thereof. The present invention, therefore, contemplates those other DNA and RNA molecules which, on expression, encode the polypeptides of, for example, portions of SEQ ID NO:4 or SEQ ID NO:6. Having identified the amino acid residue sequence encoded by a sugar biosynthetic or macrolide biosynthetic gene, and with knowledge of all triplet codons for each particular amino acid residue, it is possible to describe all such encoding RNA and DNA sequences. DNA and RNA molecules other than those specifically disclosed herein, which molecules are characterized simply by a change in a codon for a particular amino acid, are within the scope of this invention.

[0139] The 20 common amino acids and their representative abbreviations, symbols and codons are well known in the art (see, for example, Molecular Biology of the Cell, Second Edition, B. Alberts et al., Garland Publishing Inc., New York and London, 1989). As is also well known in the art, codons constitute triplet sequences of nucleotides in mRNA molecules and, as such, are characterized by the base uracil (U) in place of the base thymine (T) which is present in DNA molecules. A simple change in a codon for the same amino acid residue within a polynucleotide will not change the structure of the encoded polypeptide. By way of example, it can be seen from SEQ ID NO:6 that a TCT codon for serine exists at nucleotide positions 1735-1737. However, it can also be seen from that same sequence that serine can be encoded by a TCA codon (see, e.g., nucleotide positions 1738-1740) and a TCC codon (see, e.g., nucleotide positions 1874-1876). Substitution of the latter codons for serine with the TCT codon for serine, or vice versa, does not substantially alter the DNA sequence of SEQ ID NO:6 and results in production of the same polypeptide. In a similar manner, substitutions of the recited codons with other equivalent codons can be made without departing from the scope of the present invention.
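
By way of illustration only, the codon equivalence described above can be sketched in Python. The partial codon table uses standard genetic code assignments, but the short coding sequence and the function names are hypothetical and are not taken from SEQ ID NO:6.

```python
# Minimal sketch: exchanging one serine codon for a synonymous one changes
# the DNA sequence but not the encoded polypeptide.
# The example sequence is hypothetical and is NOT taken from SEQ ID NO:6.
CODON_TABLE = {
    "TCT": "S", "TCC": "S", "TCA": "S", "TCG": "S", "AGT": "S", "AGC": "S",
    "ATG": "M", "GCT": "A", "GCC": "A", "AAA": "K", "AAG": "K",
}

def translate(dna):
    """Translate a DNA coding sequence codon by codon (partial standard code)."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))

def swap_codon(dna, codon_index, new_codon):
    """Return a copy of dna with the codon at a zero-based codon index replaced."""
    start = 3 * codon_index
    return dna[:start] + new_codon + dna[start + 3:]

original = "ATGTCTGCTAAA"                  # Met-Ser-Ala-Lys (hypothetical)
variant = swap_codon(original, 1, "TCA")   # TCT -> TCA, both encode serine
same_protein = translate(original) == translate(variant)
```

Here the two DNA sequences differ at one nucleotide, while `same_protein` evaluates to true because both translate to the same residues.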

[0140] A nucleic acid molecule, segment or sequence of the present invention can also be an RNA molecule, segment or sequence. An RNA molecule contemplated by the present invention corresponds to, is complementary to or hybridizes under stringent conditions to any of the DNA sequences set forth herein. Exemplary and preferred RNA molecules are mRNA molecules that encode sugar biosynthetic or macrolide biosynthetic enzymes of this invention.

[0141] Mutations can be made to the native nucleic acid sequences of the invention and such mutants used in place of the native sequence, so long as the mutants are able to function with other sequences to collectively catalyze the synthesis of an identifiable polyketide or macrolide. Such mutations can be made to the native sequences using conventional techniques such as by preparing synthetic oligonucleotides including the mutations and inserting the mutated sequence into the gene using restriction endonuclease digestion. (See, e.g., Kunkel, T. A. Proc. Natl. Acad. Sci. USA (1985) 82:448; Geisselsoder et al. BioTechniques (1987) 5:786.) Alternatively, the mutations can be effected using a mismatched primer (generally 10-20 nucleotides in length) which hybridizes to the native nucleotide sequence (generally cDNA corresponding to the RNA sequence), at a temperature below the melting temperature of the mismatched duplex. The primer can be made specific by keeping primer length and base composition within relatively narrow limits and by keeping the mutant base centrally located. Zoller and Smith, Methods Enzymol., (1983) 100:468. Primer extension is effected using DNA polymerase, the product is cloned, and clones containing the mutated DNA, derived by segregation of the primer-extended strand, are selected. Selection can be accomplished using the mutant primer as a hybridization probe. The technique is also applicable for generating multiple point mutations. See, e.g., Dalbadie-McFarland et al., Proc. Natl. Acad. Sci. USA (1982) 79:6409. PCR mutagenesis will also find use for effecting the desired mutations.

[0142] Random mutagenesis of the nucleotide sequence can be accomplished by several different techniques known in the art, such as by altering sequences within restriction endonuclease sites, inserting an oligonucleotide linker randomly into a plasmid, by irradiation with X-rays or ultraviolet light, by incorporating incorrect nucleotides during in vitro DNA synthesis, by error-prone PCR mutagenesis, by preparing synthetic mutants or by damaging plasmid DNA in vitro with chemicals. Chemical mutagens include, for example, sodium bisulfite, nitrous acid, hydroxylamine, agents which damage or remove bases thereby preventing normal base-pairing such as hydrazine or formic acid, analogues of nucleotide precursors such as nitrosoguanidine, 5-bromouracil, 2-aminopurine, or acridine intercalating agents such as proflavine, acriflavine, quinacrine, and the like. Generally, plasmid DNA or DNA fragments are treated with chemicals, transformed into E. coli and propagated as a pool or library of mutant plasmids.

[0143] Large populations of random enzyme variants can be constructed in vivo using “recombination-enhanced mutagenesis.” This method employs two or more pools of, for example, 10⁶ mutants each of the wild-type encoding nucleotide sequence that are generated using any convenient mutagenesis technique and then inserted into cloning vectors.

[0144] The gene sequences can be inserted into one or more expression vectors, using methods known to those of skill in the art. Expression vectors may include control sequences operably linked to the desired genes. Suitable expression systems for use with the present invention include systems which function in eukaryotic and prokaryotic host cells. Prokaryotic systems are preferred, and in particular, systems compatible with Streptomyces spp. are of particular interest. Control elements for use in such systems include promoters, optionally containing operator sequences, and ribosome binding sites. Particularly useful promoters include control sequences derived from the gene clusters of the invention. However, other bacterial promoters, such as those derived from sugar metabolizing enzymes, such as galactose, lactose (lac) and maltose, will also find use in the expression cassettes encoding desosamine. Additional examples include promoter sequences derived from biosynthetic enzymes such as tryptophan (trp), the β-lactamase (bla) promoter system, bacteriophage lambda PL, and T5. In addition, synthetic promoters, such as the tac promoter (U.S. Pat. No. 4,551,433), which do not occur in nature, also function in bacterial host cells.

[0145] Other regulatory sequences may also be desirable which allow for regulation of expression of the genes relative to the growth of the host cell. Regulatory sequences are known to those of skill in the art, and examples include those which cause the expression of a gene to be turned on or off in response to a chemical or physical stimulus, including the presence of a regulatory compound. Other types of regulatory elements may also be present in the vector, for example, enhancer sequences.

[0146] Selectable markers can also be included in the recombinant expression vectors. A variety of markers are known which are useful in selecting for transformed cell lines and generally comprise a gene whose expression confers a selectable phenotype on transformed cells when the cells are grown in an appropriate selective medium. Such markers include, for example, genes which confer antibiotic resistance or sensitivity to the plasmid. Alternatively, several polyketides are naturally colored and this characteristic provides a built-in marker for selecting cells successfully transformed by the present constructs.

[0147] The various subunits of interest can be cloned into one or more recombinant vectors as individual cassettes, with separate control elements, or under the control of, e.g., a single promoter. The subunits can include flanking restriction sites to allow for the easy deletion and insertion of other subunits so that hybrid PKSs can be generated. The design of such unique restriction sites is known to those of skill in the art and can be accomplished using the techniques described above, such as site-directed mutagenesis and PCR.

[0148] For sequences generated by random mutagenesis, the choice of vector depends on the pool of mutant sequences, i.e., donor or recipient, with which they are to be employed. Furthermore, the choice of vector determines the host cell to be employed in subsequent steps of the claimed method. Any transducible cloning vector can be used as a cloning vector for the donor pool of mutants. It is preferred, however, that phagemids, cosmids, or similar cloning vectors be used for cloning the donor pool of mutant encoding nucleotide sequences into the host cell. Phagemids and cosmids, for example, are advantageous vectors due to the ability to insert and stably propagate therein larger fragments of DNA than in M13 phage and λ phage, respectively. Phagemids which will find use in this method generally include hybrids between plasmids and filamentous phage cloning vehicles. Cosmids which will find use in this method generally include λ phage-based vectors into which cos sites have been inserted. Recipient pool cloning vectors can be any suitable plasmid. The cloning vectors into which pools of mutants are inserted may be identical or may be constructed to harbor and express different genetic markers (see, e.g., Sambrook et al., supra). The utility of employing such vectors having different marker genes may be exploited to facilitate a determination of successful transduction.

[0149] Thus, for example, the cloning vector employed may be a phagemid and the host cell may be E. coli. Upon infection of the host cell which contains a phagemid, single-stranded phagemid DNA is produced, packaged and extruded from the cell in the form of a transducing phage in a manner similar to other phage vectors. Thus, clonal amplification of mutant encoding nucleotide sequences carried by phagemids is accomplished by propagating the phagemids in a suitable host cell.

[0150] Following clonal amplification, the cloned donor pool of mutants is infected with a helper phage to obtain a mixture of phage particles containing either the helper phage genome or phagemids carrying mutant alleles of the wild-type encoding nucleotide sequence.

[0151] Infection, or transfection, of host cells with helper phage is generally accomplished by methods well known in the art (see, e.g., Sambrook et al., supra; and Russell et al. (1986) Gene 45:333-338).

[0152] The helper phage may be any phage which can be used in combination with the cloning phage to produce an infective transducing phage. For example, if the cloning vector is a cosmid, the helper phage will necessarily be a λ phage. Preferably, the cloning vector is a phagemid and the helper phage is a filamentous phage, and preferably phage M13.

[0153] If desired, after infecting the phagemid with helper phage and obtaining a mixture of phage particles, the transducing phage can be separated from helper phage based on size difference (Barnes et al. (1983) Methods Enzymol. 101:98-122), or by another similarly effective technique.

[0154] The entire spectrum of cloned donor mutations can now be transduced into clonally amplified recipient cells into which has been transduced or transformed a pool of mutant encoding nucleotide sequences. Recipient cells which may be employed in the method disclosed and claimed herein may be, for example, E. coli, or other bacterial expression systems which are not recombination deficient. A recombination deficient cell is a cell in which the frequency of recombination events is greatly reduced, such as rec⁻ mutants of E. coli (see, Clark et al. (1965) Proc. Natl. Acad. Sci. USA 53:451-459).

[0155] These transductants can now be selected for the desired expressed protein property or characteristic and, if necessary or desirable, amplified. Optionally, if the phagemids into which each pool of mutants is cloned are constructed to express different genetic markers, as described above, transductants may be selected by way of their expression of both donor and recipient plasmid markers.

[0156] The recombinants generated by the above-described methods can then be subjected to selection or screening by any appropriate method, for example, enzymatic or other biological activity.

[0157] The above cycle of amplification, infection, transduction, and recombination may be repeated any number of times using additional donor pools cloned on phagemids. As above, the phagemids into which each pool of mutants is cloned may be constructed to express a different marker gene. Each cycle could increase the number of distinct mutants by up to a factor of 10⁶. Thus, if the probability of occurrence of an inter-allelic recombination event in any individual cell is f (a parameter that is actually a function of the distance between the recombining mutations), the transduced culture from two pools of 10⁶ allelic mutants will express up to 10¹² distinct mutants in a population of 10¹²/f cells.
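
By way of illustration only, the arithmetic of the preceding paragraph can be sketched in Python. The function names are hypothetical, and the sketch treats f as a single constant, whereas the text notes that f actually depends on the distance between the recombining mutations.

```python
# Minimal sketch of the library-size arithmetic for recombination-enhanced
# mutagenesis. Treating f as one constant is a simplifying assumption.

def max_distinct_mutants(pool_sizes):
    """Upper bound on distinct recombinants: the product of the pool sizes."""
    total = 1
    for size in pool_sizes:
        total *= size
    return total

def cells_required(pool_sizes, f):
    """Population size needed to sample every combination, given a per-cell
    inter-allelic recombination probability f (0 < f <= 1)."""
    if not 0 < f <= 1:
        raise ValueError("f must be a probability in the interval (0, 1]")
    return max_distinct_mutants(pool_sizes) / f

two_pools = [10**6, 10**6]
upper_bound = max_distinct_mutants(two_pools)   # 10**12 distinct mutants
population = cells_required(two_pools, 0.5)     # 2 x 10**12 cells if f = 0.5
```

Adding a third pool of 10⁶ mutants in a further cycle multiplies the upper bound by 10⁶ again, which is the per-cycle factor stated above.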

I. Experimental Procedures

[0158] Materials and Methods

[0159] Materials. Sodium R-(−)-3-hydroxybutyrate, coenzyme-A, ethylchloroformate, pyridine and diethyl ether were purchased from Sigma Chemical Co. Amberlite IR-120 was purchased from Mallinckrodt Inc. 6-O-(N-Heptylcarbamoyl)methyl α-D-glucopyranoside (Hecameg) was obtained from Vegatec (Villejuif, France). Two-piece spectrophotometer cells with pathlengths of 0.1 (#20/0-Q-1) and 0.01 cm (#20/0-Q-0.1) were obtained from Starna Cells Inc. (Atascadero, Calif.). Rabbit anti-A. eutrophus PHA synthase antibody was a gracious gift from Dr. F. Srienc and S. Stoup (Biological Process Technology Institute, University of Minnesota). Sf21 cells and T. ni cells were kindly provided by Greg Franzen (R&D Systems, Minneapolis, Minn.) and Stephen Harsch (Department of Veterinary Pathobiology, University of Minnesota), respectively.

[0160] Plasmid pFAS206 and a recombinant baculoviral clone encoding FAS206 (Joshi et al., J. Biol. Chem., 268, 22508 (1993)) were generous gifts of A. Joshi and S. Smith. Plasmid pAet41 (Peoples et al., J. Biol. Chem., 264, 15298 (1989)), the source of the A. eutrophus PHB synthase, was obtained from A. Sinskey. Baculovirus transfer vector, pBacPAK9, and linearized baculoviral DNA, were obtained from Clontech Inc. (Palo Alto, Calif.). Restriction enzymes, T4 DNA ligase, E. coli DH5α competent cells, molecular weight standards, lipofectin reagent, Grace's insect cell medium, fetal bovine serum (FBS), and antibiotic/antimycotic reagent were obtained from GIBCO-BRL (Grand Island, N.Y.). Tissue culture dishes were obtained from Corning Inc. Spinner flasks were obtained from Bellco Glass Inc. Seaplaque agarose GTG was obtained from FMC Bioproducts Inc.

[0161] Methods

[0162] Preparation of R-3-HBCoA. R-(−)-3-HBCoA was prepared by the mixed anhydride method described by Haywood et al., FEMS Microbiol. Lett., 51, 1 (1989). 60 mg (0.58 mmol) of R-(−)-3-hydroxybutyric acid was freeze-dried and added to a solution of 72 mg of pyridine in 10 ml diethyl ether at 0° C. Ethylchloroformate (100 mg) was added, and the mixture was allowed to stand at 4° C. for 60 minutes. Insoluble pyridine hydrochloride was removed by centrifugation. The resulting anhydride was added, dropwise with mixing, to a solution of 100 mg coenzyme-A (0.13 mmol) in 4 ml 0.2 M potassium bicarbonate, pH 8.0 at 0° C. The reaction was monitored by the nitroprusside test of Stadtman, Meth. Enzymol., 3, 931 (1957), to ensure sufficient anhydride was added to esterify all the coenzyme-A. The concentration of R-3-HBCoA was determined by measuring the absorbance at 260 nm (ε=16.8 mM⁻¹ cm⁻¹; 18).
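
By way of illustration only, the molar quantities and the absorbance-based concentration in the preceding paragraph can be checked with a short Python sketch. The molecular weights (3-hydroxybutyric acid, about 104.1 g/mol; coenzyme-A free acid, about 767.5 g/mol) are standard values assumed here rather than stated in the text, and the function names and the example absorbance reading are hypothetical.

```python
# Minimal sketch: mass-to-mole conversion and Beer-Lambert concentration.
# Molecular weights are assumed standard values, not taken from the text.
MW_3HB_ACID = 104.1      # g/mol, 3-hydroxybutyric acid
MW_COENZYME_A = 767.5    # g/mol, coenzyme-A (free acid)
EPSILON_260_MM = 16.8    # mM^-1 cm^-1, as given for R-3-HBCoA at 260 nm

def mmol_from_mg(mass_mg, mol_weight):
    """mg divided by g/mol gives mmol."""
    return mass_mg / mol_weight

def conc_mM_from_a260(absorbance, path_cm=1.0, dilution=1.0):
    """Beer-Lambert law: c = A / (epsilon * l), corrected for any dilution."""
    return absorbance * dilution / (EPSILON_260_MM * path_cm)

acid_mmol = mmol_from_mg(60.0, MW_3HB_ACID)      # about 0.58 mmol
coa_mmol = mmol_from_mg(100.0, MW_COENZYME_A)    # about 0.13 mmol
example_mM = conc_mM_from_a260(0.84)             # hypothetical reading, 0.05 mM
```

Under these assumptions the acid is in roughly 4.4-fold molar excess over coenzyme-A.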

[0163] Construction of pBP-phbC. The phbC gene (approximately 1.8 kb) was excised from pAet41 (Peoples et al., J. Biol. Chem., 264, 15293 (1989)) by digestion with BstBI and StuI, purified as described by Williams et al. (Gene, 109, 445 (1991)), and ligated to pBacPAK9 digested with BstBI and StuI. This resulted in pBP-phbC, the baculovirus transfer vector used in formation of recombinant baculovirus particles carrying phbC.

[0164] Large-scale expression of PHA synthase. A 1 L culture of T. ni cells (1.2×10⁶ cells/ml) in logarithmic growth was infected by the addition of 50 ml recombinant viral stock solution (2.5×10⁸ pfu/ml), resulting in a multiplicity of infection (MOI) of 10. This infected culture was split between two Bellco spinners (350 ml/500 ml spinner, 700 ml/1 L spinner) to facilitate oxygenation of the culture. These cultures were incubated at 28° C. and stirred at 60 rpm for 60 hours. Infected cells were harvested by centrifugation at 1000× g for 10 minutes at 4° C. Cells were flash frozen in liquid N₂ and stored in 4 equal aliquots at −80° C. until purification.
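
By way of illustration only, the multiplicity of infection quoted above follows from the stated volumes, titer and cell density. A minimal Python sketch, with a hypothetical function name:

```python
# Minimal sketch: multiplicity of infection (MOI) = infectious particles per cell.

def multiplicity_of_infection(virus_ml, titer_pfu_per_ml, culture_ml, cells_per_ml):
    """Ratio of plaque-forming units added to the number of cells infected."""
    total_pfu = virus_ml * titer_pfu_per_ml
    total_cells = culture_ml * cells_per_ml
    return total_pfu / total_cells

# Values from the large-scale expression paragraph:
# 50 ml of stock at 2.5e8 pfu/ml added to 1 L of culture at 1.2e6 cells/ml.
moi = multiplicity_of_infection(50.0, 2.5e8, 1000.0, 1.2e6)   # about 10.4
```

The result, roughly 10.4, is consistent with the stated MOI of 10.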

[0165] Insect cell maintenance and recombinant baculovirus formation. Sf21 cells were maintained at 26-28° C. in Grace's insect cell medium supplemented with 10% FBS, 1.0% pluronic F68, and 1.0% antibiotic/antimycotic (GIBCO-BRL). Cells were typically maintained in suspension at 0.2-2.0×10⁶/ml in 60 ml total culture volume in 100 ml spinner flasks at 55-65 rpm. Cell viability during the culture period was typically 95-100%. The procedures for use of the transfer vector and baculovirus were essentially those described by the manufacturer (Clontech, Inc.). Purified pBP-phbC and linearized baculovirus DNA were used for cotransfection of Sf21 cells using the liposome-mediated method (Felgner et al., Proc. Natl. Acad. Sci. USA, 84, 7413 (1987)) utilizing Lipofectin (GIBCO-BRL). Four days later, cotransfection supernatants were utilized for plaque purification. Recombinant viral clones were purified from plaque assay plates containing 1.5% Seaplaque GTG after 5-7 days at 28° C. Recombinant viral clone stocks were then amplified in T25-flask cultures (4 ml, 3×10⁶/ml on day 0) for 4 days; infected cells were identified by their morphology and size and then screened by SDS/PAGE using 10% polyacrylamide gels (Laemmli, Nature, 227, 680 (1970)) for production of PHA synthase.

[0166] Purification of PHA synthase from BTI-TN-5B1-4 T. ni cells. Purification of PHA synthase was performed according to the method of Gerngross et al., Biochemistry, 33, 9311 (1994) with the following alterations. One aliquot (110 mg protein) of frozen cells was thawed on ice and resuspended in 10 mM KPi (pH 7.2), 5% glycerol, and 0.05% Hecameg (Buffer A) containing the following protease inhibitors at the indicated final concentrations: benzamidine (2 mM), phenylmethylsulfonyl fluoride (PMSF, 0.4 mM), pepstatin (2 µg/ml), leupeptin (2.5 µg/ml), and Nα-p-tosyl-L-lysine chloromethyl ketone (TLCK, 2 mM). EDTA was omitted at this stage due to its incompatibility with hydroxylapatite (HA). This mixture was homogenized with three series of 10 strokes each in two Thomas homogenizers while partially submerged in an ice bath and then sonicated for 2 minutes in a Branson Sonifier 250 at 30% cycle, 30% power while on ice. All subsequent procedures were carried out at 4° C.

[0167] The lysate was immediately centrifuged at 100,000× g in a Beckman 50.2Ti rotor for 80 minutes, and the resulting supernatant (10.5 ml, 47 mg) was immediately filtered through a 0.45 µm Uniflow filter (Schleicher and Schuell Inc., Keene, N.H.) to remove any remaining insoluble matter. Aliquots of the soluble fraction (1.5 ml, 7 mg) were loaded onto a 5 ml BioRad Econo-Pac HTP column that had been equilibrated with Buffer A (+protease inhibitor mix) attached to a BioRad Econo-system, and the column was washed with 30 ml Buffer A. All chromatographic steps were carried out at a flow rate of 0.8 ml/minute. PHA synthase was eluted from the HA column with a 32×32 ml linear gradient from 10 to 300 mM KPi.

[0168] Fraction collection tubes were prepared by addition of 30 µl of 100 mM EDTA to provide a metalloprotease inhibitor at 1 mM immediately after HA chromatography. PHA synthase was eluted in a broad peak between 110-180 mM KPi. Fractions (3 ml) containing significant PHA synthase activity were pooled and stored at 0° C. until the entire soluble fraction had been run through the chromatographic process. Pooled fractions then were concentrated at 4° C. by use of a Centriprep-30 concentrator (Amicon) to 3.8 mg/ml. Aliquots (0.5 ml) were either flash frozen and stored in liquid N₂ or glycerol was added to a final concentration of 50% and samples (1.9 mg/ml) were stored at −20° C.
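
By way of illustration only, the EDTA addition can be checked as a simple dilution (C₁V₁ = C₂V₂). The sketch below assumes that the 100 mM EDTA aliquot is 30 microliters per 3 ml fraction, which is the reading consistent with the stated final concentration of 1 mM; the function name is hypothetical.

```python
# Minimal sketch: final concentration after spiking a small stock volume
# into a collected fraction. The 30 microliter aliquot is an assumption.

def final_conc_mM(stock_mM, stock_ul, sample_ml):
    """Concentration after adding stock_ul of stock to sample_ml of sample."""
    stock_ml = stock_ul / 1000.0
    return stock_mM * stock_ml / (sample_ml + stock_ml)

edta_mM = final_conc_mM(100.0, 30.0, 3.0)   # about 0.99 mM, i.e. roughly 1 mM
```

The same calculation shows why a 30 ml aliquot would not fit the text: it would give roughly 91 mM EDTA in a 3 ml fraction.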

[0169] Western analysis. Samples of T. ni cells were fractionated by SDS-PAGE on 10% polyacrylamide gels, and the proteins then were transferred to 0.2 µm nitrocellulose membranes using a BioRad Transblot SD Semi-Dry electrophoretic transfer cell according to the manufacturer. Proteins were transferred for 1 hour at 15 V. The membrane was rinsed with doubly distilled H₂O, dried, and treated with phosphate-buffered saline (PBS) containing 0.05% Tween-20 (PBS-Tween) and 3% nonfat dry milk to block non-specific binding sites. Primary antibody (rabbit anti-PHA synthase) was applied in fresh blocking solution and incubated at 25° C. for 2 hours. Membranes were then washed four times for 10 minutes with PBS-Tween followed by the addition of horseradish peroxidase-conjugated goat-anti-rabbit antibody (Boehringer-Mannheim) diluted 10,000× in fresh blocking solution and incubated at 25° C. for 1 hour. Membranes were washed finally in three changes (10 minutes) of PBS, and the immobilized peroxidase label was detected using the chemiluminescent LumiGLO substrate kit (Kirkegaard and Perry, Gaithersburg, Md.) and X-ray film.

[0170] N-terminal analysis. Approximately 10 µg of purified PHA synthase was run on a 10% SDS-polyacrylamide gel, transferred to PVDF (Immobilon-PSQ, Millipore Corporation, Bedford, Mass.), stained with Amido Black, and sequenced on a 494 Procise Protein Sequencer (Perkin-Elmer, Applied Biosystems Division, Foster City, Calif.).

[0171] Double-infection protocol. Four 100 ml spinner flasks were each inoculated with 8×10⁷ cells in 50 ml of fresh insect medium. To flask 1, an additional 20 ml of fresh insect medium was added (uninfected control); to flask 2, 10 ml BacPAK6::phbC viral stock (1×10⁸ pfu/ml) and 10 ml fresh insect medium were added; to flask 3, 10 ml BacPAK6::FAS206 viral stock (1×10⁸ pfu/ml) and 10 ml fresh insect medium were added; and to flask 4, 10 ml BacPAK6::phbC viral stock (1×10⁸ pfu/ml) and 10 ml BacPAK6::FAS206 viral stock (1×10⁸ pfu/ml) were added. These viral infections were carried out at a multiplicity of infection of approximately 10. Cultures were maintained under normal growth conditions and 15 ml samples were removed at 24, 48, and 72 hour time points. Cells were collected by gentle centrifugation at 1000× g for 5 minutes, the medium was discarded, and the cells were immediately stored at −70° C.

[0172] PHA synthase assays. Coenzyme A released by PHA synthase in the process of polymerization was monitored precisely as described by Gerngross et al. (supra) using 5,5′-dithiobis(2-nitrobenzoic acid) (DTNB) (Ellman, Arch. Biochem. Biophys., 82, 70 (1959)).

[0173] The presence of HBCoA was monitored spectrophotometrically. Assays were performed at 25° C. in a Hewlett Packard 8452A diode array spectrophotometer equipped with a water-jacketed cell holder. Two-piece Starna Spectrosil spectrophotometer cells with pathlengths of 0.1 and 0.01 cm were employed to avoid errors arising from the compression of the absorbance scale at higher values. Absorbance was monitored at 232 nm, and an ε₂₃₂ of 4.5×10³ M⁻¹ cm⁻¹ was used in calculations. One unit (U) of enzyme is the amount required to hydrolyze 1 µmol of substrate minute⁻¹. Buffer (0.15 M KPi, pH 7.2) and substrate were equilibrated to 25° C. and then combined in an Eppendorf tube also at 25° C. Enzyme was added and mixed once in the pipet tip used to transfer the entire mixture to the spectrophotometer cell. The two-piece cell was immediately assembled and placed in the spectrophotometer using the cell holder (type CH) adapted for the standard 10 mm pathlength cell holder of the spectrophotometer. Manipulations of sample, from mixing to initiation of monitoring, took only 10-15 seconds. Absorbance was continually monitored for up to 10 minutes. Calibration of reactions was against a solution of buffer and enzyme (no substrate), which led to absorbance values that represented substrate only.
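
By way of illustration only, the conversion of an absorbance slope at 232 nm into enzyme units can be sketched in Python using the Beer-Lambert law. The sketch assumes that one unit corresponds to 1 micromole of substrate consumed per minute; the function name, the example slope and the example assay volume are hypothetical.

```python
# Minimal sketch: enzyme units from the rate of absorbance change at 232 nm.
# Assumes 1 U = 1 micromole of substrate consumed per minute.
EPSILON_232 = 4.5e3   # M^-1 cm^-1, as used in the assay calculations

def synthase_units(delta_a_per_min, path_cm, volume_ml, epsilon=EPSILON_232):
    """Convert an absorbance slope (per minute) into micromoles per minute."""
    # Beer-Lambert law: change in concentration = change in A / (epsilon * l)
    rate_molar_per_min = abs(delta_a_per_min) / (epsilon * path_cm)
    # mol/L/min * L * 1e6 gives micromoles per minute
    return rate_molar_per_min * (volume_ml / 1000.0) * 1e6

# Hypothetical reading: A232 falls by 0.045 per minute in the 0.1 cm cell,
# with 0.1 ml of assay mixture.
units = synthase_units(-0.045, 0.1, 0.1)   # 0.01 micromole per minute
```

Dividing the result by the mass of protein in the assay would give a specific activity in units per milligram.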

[0174] PHB assay. PHB was assayed from Sf21 cell samples according to the propanolysis method of Riis et al., J. Chromo., 445, 285 (1988). Cell pellets were thawed on ice, resuspended in 1 ml cold ddH₂O and transferred to 5 ml screwtop test tubes with teflon seals. Two ml of ddH₂O were added, the cells were washed and centrifuged and then 3 ml of acetone were added and the cells washed and centrifuged. The samples were then desiccated by placing them in a 94° C. oven for 12 hours. The following day 0.5 ml of 1,2-dichloroethane, 0.5 ml acidified propanol (20 ml HCl, 80 ml 1-propanol) and 50 ml benzoic acid standard were added and the sealed tubes were heated to 100° C. in a boiling water bath for 2 hours with periodic vortexing. The tubes were cooled to room temperature and the organic phase was used for gas-chromatographic (GC) analysis using a Hewlett Packard 5890A gas chromatograph equipped with a Hewlett Packard 7673A automatic injector and a fused silica capillary column, DB-WAX 30W of 30 meter length. Positive samples were further subjected to GC-mass spectrometric (MS) analysis for the presence of propylhydroxybutyrate using a Kratos MS25 GC/MS. The following parameters were used: source temperature, 210° C.; voltage, 70 eV; and accelerating voltage, 4 keV.

[0175] Catalytic Activities

[0176] Ketoacyl synthase (KS) activity was assessed radiochemically by the condensation-¹⁴CO₂ exchange reaction (Smith et al., PNAS USA, 13, 1184 (1976)).

[0177] Transferase (AT) activity was assayed, using malonyl-CoA as donor and pantetheine as acceptor, by determining spectrophotometrically the free CoA released in a coupled ATP citrate-lyase-malate dehydrogenase reaction (see Rangan et al., J. Biol. Chem., 266, 19180 (1991)).

[0178] Ketoreductase (KR) was assayed spectrophotometrically at 340 nm: assay systems contained 0.1 M potassium phosphate buffer (pH 7), 0.15 mM NADPH, enzyme and either 10 mM trans-1-decalone or 0.1 mM acetoacetyl-CoA substrate.

[0179] Dehydrase (DH) activity was assayed spectrophotometrically at 270 nm using S-DL-β-hydroxybutyryl N-acetylcysteamine as substrate (Kumar et al., J. Biol. Chem., 245, 4732 (1970)).

[0180] Enoyl reductase (ER) activity was assayed spectrophotometrically at 340 nm essentially as described by Strom et al. (J. Biol. Chem., 254, 8159 (1979)); the assay system contained 0.1 M potassium phosphate buffer (pH 7), 0.15 mM NADPH, 0.375 nM crotonoyl-CoA, 20 μM CoA and enzyme.

[0181] Thioesterase (TE) activity was assessed radiochemically by extracting and assaying the [¹⁴C]palmitic acid formed from [1-¹⁴C]palmitoyl-CoA during a 3 minute incubation (Smith, Meth. Enzymol., 71C, 181 (1981)); the assay was in a final volume of 0.1 ml, 25 mM potassium phosphate buffer (pH 8), 20 μM [1-¹⁴C]palmitoyl-CoA (20 nCi) and enzyme.

[0182] Assay of overall fatty acid synthase activity was performed spectrophotometrically as described previously by Smith et al. (Meth. Enzymol., 35, 65 (1975)). All enzyme activities were assayed at 37° C. except the transferase, which was assayed at 20° C. Activity units indicate nmol of substrate consumed/minute. All assays were conducted, at a minimum, at two different protein concentrations with the appropriate enzyme and substrate blanks included.

II. EXAMPLES

Example 1 Expression of A. eutrophus PHA Synthase Using a Baculovirus System

[0183] Recent work has shown that PHA synthase from A. eutrophus can be overexpressed in E. coli, in the absence of 3-ketothiolase and acetoacetyl-CoA reductase (Gerngross et al., supra) and can be expressed in plants (see Poirier et al., Biotech, 13, 142 (1995) for a review). Isolation of the soluble form of PHA synthase provides opportunities to examine the mechanistic details of the priming and initiation reactions. Because the baculovirus system has been successful for the expression of a number of prokaryotic genes as soluble proteins, and insect cells, unlike bacterial expression systems, carry out a wide array of post-translational modifications, the baculovirus expression system appeared ideal for the expression of large quantities of soluble PHA synthase, a protein that must be modified by phosphopantetheine in order to be catalytically active (Gerngross et al., supra).

[0184] Purification of PHA synthase. The purification procedure employed for PHA synthase is a modification of Gerngross et al. (supra) involving the elimination of the second liquid chromatographic step and inclusion of a protease-inhibitor cocktail in all buffers. All steps were carried out on ice or at 4° C. except where noted. Frozen cells were thawed on ice in 10 ml of Buffer A (10 mM KPi, pH 7.2, 05% glycerol, and 0.05% Hecameg) and then immediately homogenized prior to centrifugation and HA chromatography.

[0185] The results of these efforts are summarized in Table 1 and FIG. 7. A prominent band at 64 kDa is visible in total, soluble, and HA eluate protein samples fractionated by SDS/PAGE (lanes 4, 5, and 6 of FIG. 7, respectively). The initial specific activity of the isolated PHA synthase was 20-fold higher than previous attempts at expression and purification of this polypeptide. Approximately 1000 units of PHB synthase have been purified, based on calculations from the direct spectrophotometric assay detailed below, with an overall recovery of activity of 70%. The large proportion of synthase present in the membrane fraction, and the fact that over 90% of the initial activity was found in the soluble fraction, suggest either that the synthase in the membrane fraction is in an inactive form or that the direct assay is not applicable to the initial, 12 U/mg, crude extract.

TABLE 1 Purification of PHA Synthase

sample                total units   vol (mL)   protein (mg)   protein (mg/ml)   specific activity (U/mg)   recovery (%)
total protein         1430          11.5       113            9.8               12.7                       100
soluble protein       1340          10.5       47             4.5               28.6                       93
pooled HA fractions   1020          7.9        30             3.8               34.2                       71
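The derived columns of Table 1 (protein concentration, specific activity, and recovery) follow from the total units, volume, and protein mass of each sample. The sketch below recomputes them as a consistency check; the tuple layout and function name are illustrative assumptions, and small differences from the tabulated values are expected because the inputs are themselves rounded.

```python
# Sketch: recompute the derived columns of Table 1 from the primary values.
# Each row is (sample, total units, volume in mL, protein in mg), as tabulated.

ROWS = [
    ("total protein", 1430, 11.5, 113),
    ("soluble protein", 1340, 10.5, 47),
    ("pooled HA fractions", 1020, 7.9, 30),
]


def summarize(rows):
    """Return concentration, specific activity, and recovery for each row."""
    starting_units = rows[0][1]  # recovery is relative to the first row
    summary = []
    for sample, units, volume_ml, protein_mg in rows:
        summary.append({
            "sample": sample,
            "mg_per_ml": protein_mg / volume_ml,
            "specific_activity": units / protein_mg,        # U/mg
            "recovery_pct": 100.0 * units / starting_units,
        })
    return summary


if __name__ == "__main__":
    for row in summarize(ROWS):
        print("{sample:20s} {mg_per_ml:5.1f} {specific_activity:5.1f} "
              "{recovery_pct:5.1f}".format(**row))
```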

[0186] N-terminal sequencing of the 64 kDa protein confirmed its identity as PHA synthase (FIG. 8). Two prominent N-termini, at amino acid residue 7 (alanine) and residue 10 (serine), were obtained in a 3:2 ratio. This heterogeneous N-terminus presumably is the result of aminopeptidase activity. Western analysis using a rabbit-anti-PHA synthase antibody corroborated the results of the sequencing and indicated the presence of at least three bands that resulted from proteolysis of PHA synthase (FIG. 7B, lanes 4-6). The antibody was specific for PHA synthase since neither T. ni nor baculoviral proteins showed reactivity (FIG. 7B, lanes 2 and 3). N-terminal protein sequencing (FIG. 8) showed directly that the 44 kDa (band b) and 32 kDa (band d) proteins were derived from PHA synthase (fragments beginning at A181/N185 and at G387, respectively). The 35-40 kDa (band c) protein gave low sequencing yields and may contain a blocked N-terminus. Inspection of FIG. 7B suggests that most degradation occurs following cell disruption since the total protein sample of this gel (lane 4) was prepared by boiling intact cells directly in SDS sample buffer while the HA sample (lane 6) went through the purification procedure described above.

[0187] Assay of synthase activity. Due to the significant level of expression obtained using the baculovirus system, the synthase activity could be assayed spectrophotometrically by monitoring hydrolysis of the thioester bond at 232 nm, the wavelength at which there is a maximum decrease in absorbance upon hydrolysis. The difference between substrate (HBCoA) and product (CoA) at this wavelength is shown in FIG. 9. Absorbance of HBCoA and CoA at 232 nm occurs at a trough between two well-separated peaks. Assays were carried out at pH 7.2 for comparative analysis with previous studies (Gerngross et al., supra). The substrate (R-(−)-3-HBCoA) for these studies was prepared using the mixed anhydride method (Haywood et al., supra), and its concentration was determined by measuring A₂₆₀. The short pathlength cells (0.1 cm and 0.01 cm) allowed use of relatively high reaction concentrations while conserving substrate and enzyme. Assay results showed an initial lag period of 60 seconds prior to the linear decrease in A₂₃₂, and velocities were determined from the slope of these linear regions of the assay curves. The length of the lag period was variable and was inversely related to enzyme concentration. These data are consistent with those using PHA synthase purified from E. coli (Gerngross et al., supra).

[0188] FIGS. 10 and 11 show the V versus S and 1/V versus 1/S plots, respectively. The double reciprocal plot was concave upward, which is similar to results obtained from studies of the granular PHA synthase from Zoogloea ramigera (Fukui et al., Arch. Microbiol., 110, 149 (1976)) and suggests a complex reaction mechanism. Examinations of velocity and specific activity as a function of enzyme concentration are shown in FIGS. 12 and 13. These results confirm that specific activity of the synthase depends upon enzyme concentration. The pH activity curve for A. eutrophus PHA synthase purified from T. ni cells is shown in FIG. 14. The curve shows a broad activity maximum centered around pH 8.5. This result agrees well with prior work on the A. eutrophus PHB synthase, although it is significantly different from results obtained for the PHB synthase from Z. ramigera, for which the optimum was determined to be pH 7.0.

[0189] The effect of varying enzyme concentration in the presence of a fixed amount of substrate revealed an intriguing trend (FIG. 15). From these data it appears that the extent of polymerization is dependent on the amount of enzyme included in the reaction mixture. This could be explained if there is a “terminal length” limitation of the polymer, which, once reached, cannot be extended any further. If this is the case, it would also suggest that termination of the polymerization reaction, the release of the synthase from the polymer, and/or reinitiation of polymerization by the newly released synthase are relatively slow events, since no evidence of these reactions is seen within the time course of these studies. The phenomenon observed in FIG. 15 is not the result of decay of the enzyme over the course of the assay, since virtually identical results are obtained following a 10 minute preincubation of the synthase at 25° C.

[0190] It must also be noted that comparisons of the direct spectrophotometric assays used here and the more common assay involving the use of Ellman's reagent, DTNB, (Ellman, supra) in the formation of thiolate of coenzyme-A showed that the values determined by the direct method were approximately 70% of the values determined using Ellman's reagent. This may be due to phase separation occurring in the cuvettes as the relatively insoluble polymer is formed. In support of this notion, a faint haze or opalescence in the cuvette developed during the course of the reaction, particularly at higher substrate concentrations.

[0191] PHA synthase purified from insect cells appears to be relatively stable. Examination of activity following storage in liquid N₂ and at −20° C. in the presence of 50% glycerol showed that approximately 50% of synthase activity remained after 7 weeks when stored in liquid N₂ and approximately 75% of synthase activity remained after 7 weeks when stored at −20° C. in the presence of 50% glycerol.

[0192] The expression of PHA synthase from A. eutrophus in a baculovirus expression system results in the synthase constituting approximately 50% of total protein 60 hours post-infection; however, approximately 50-75% of the synthase is observed in the membrane-associated fraction. This elevated level of expression allowed purification of the soluble PHA synthase using a single chromatographic step on HA. The purity of this preparation is estimated to be approximately 90% (intact PHA synthase and 3 proteolysis products).

[0193] The initial specific activity of 12 U/mg was approximately 20-fold higher than the most successful previous efforts at overexpression of A. eutrophus PHA synthase. The synthase reported here was isolated from a 250 ml culture with 70% recovery, which represents an improvement of 500-fold (1000 U/64 U × 8 L/0.25 L) when compared to an 8 L E. coli culture with 40% recovery. This high expression level should provide sufficient PHA synthase for extensive structural, functional, and mechanistic studies. Furthermore, it is clear that the baculovirus expression system is an attractive option for isolation of other PHA synthases from various sources.
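The 500-fold figure is a ratio of volumetric yields, i.e. units recovered per liter of culture. The brief sketch below reproduces the arithmetic of the parenthetical expression; the function name is a hypothetical label, and the 64 U value is taken from that expression as the yield of the 8 L E. coli culture.

```python
# Sketch: fold improvement expressed as a ratio of units recovered per liter
# of culture. All four numbers come from the parenthetical expression in the
# text; the function name is hypothetical.


def units_per_liter(units, culture_volume_l):
    """Volumetric yield of enzyme activity."""
    return units / culture_volume_l


insect_yield = units_per_liter(1000, 0.25)   # baculovirus culture
e_coli_yield = units_per_liter(64, 8)        # E. coli culture
fold_improvement = insect_yield / e_coli_yield

print(insect_yield, e_coli_yield, fold_improvement)
```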

[0194] PHA synthase produced in the baculovirus system was of sufficient potency to allow direct spectrophotometric analysis of the hydrolysis of the thioester bond of HBCoA at 232 nm. These assays revealed a lag period of approximately 60 seconds, the length of which was variable and inversely related to enzyme concentration. Such a lag period presumably reflects a slow step in the reaction, perhaps correlating to dimerization of the enzyme, the priming, and/or initiation steps in formation of PHB. Size exclusion chromatographic examination of the PHB synthase native MW indicated two forms of the synthase. One form showed a MW of approximately 100-160 kDa and the other showed a MW of approximately 50-80 kDa; these two forms likely represent the dimer and monomer of PHA synthase, respectively. Similar results have been reported previously in which two forms of approximately 60 and 130 kDa were observed. Comparisons of the direct assay reported here and the indirect assay using DTNB revealed that the former resulted in values that were 70% of the values determined by the DTNB indirect assay. Although the reason for this difference has not been examined in detail, it is probable that the apparent phase separation that occurred upon PHB formation in the short pathlength cuvettes used, particularly with high [HBCoA], results in this discrepancy.

[0195] Enzymatic analyses of the PHA synthase have found that the enzyme has a broad pH optimum centered at pH 8.5; however, the studies described herein have been performed at pH 7.2 to provide comparative values with the results of others. Moreover, the specific activity of this enzyme is dependent upon enzyme concentration, which confirms and extends earlier results (Gerngross et al., supra).

[0196] In studies intended to examine the dependence of activity upon enzyme concentration, it became apparent that the extent of the polymerization reaction is dependent on the amount of enzyme included in the reaction mixture. Specifically, decreasing the amount of enzyme leads not only to decreased velocity of reaction but also to a decreased extent of condensation (FIG. 15). One possible explanation is that the enzyme is thermally labile; however, identical assays in which the enzyme is preincubated at 25° C. for 10 minutes prior to initiation of the reaction had similar results. Another possibility is that a terminal length of the polymer is reached, precluding further condensations until the particular synthase molecule is released from the terminal-length polymer.

[0197] This work clearly demonstrates the value of the baculovirus expression system for the production of A. eutrophus PHA synthase and for the potential application to studies of other PHA synthases. Furthermore, the high level of expression obtained using the baculoviral system should allow convenient analysis for substrate-specificity and structure-function studies of PHA synthases from relatively crude insect cell extracts.

Example 2 Co-Expression of Rat FAS Dehydrase Mutant cDNA and PHB Synthase Gene in Insect Cells

[0198] Expression of a rat FAS DH- cDNA in Sf9 cells has been reported previously (Rangan et al., J. Biol. Chem., 266, 19180 (1991); Joshi et al., Biochem. J., 296, 143 (1993)). Once activity of the phbC gene product had been established in insect cells (see Example 1), baculovirus clones containing the rat FAS DH- cDNA and BacPAK6::phbC were employed in a double-infection strategy to determine if PHB would be produced in insect cells. It was not known if an intracellular pool of R(−)-3-hydroxybutyrate would be stable or available as a substrate for the PHB synthase. In order for the R-(−)-3-hydroxybutyrylCoA to be available as a substrate, the R-(−)-3-hydroxybutyrylCoA released from rat FAS DH- protein must be trapped by the PHB synthase and incorporated into a polymer at a rate faster than β-oxidation, which would regenerate acetylCoA. It was also not known if the stereochemical configuration of the 3-hydroxyl group, which must be in the R form, would be recognized as a substrate by PHB synthase. Fortunately, previous biochemical studies on eukaryotic FASs indicated that the R form of 3-hydroxybutyrylCoA would be generated (Wakil et al., J. Biol. Chem., 237, 687 (1962)).

[0199] SDS-PAGE of protein samples from a time course of uninfected, single-infected, and dual-infected Sf21 cells was performed (FIG. 16). From these data, it is clear that the rat FAS DH mutant and PHB synthase polypeptides are efficiently co-expressed in Sf21 cells. However, co-expression results in ˜50% reduced levels of both polypeptides compared to Sf21 cells that are producing the individual proteins. Western analysis using anti-rat FAS (Rangan et al., supra) and anti-PHA synthase antibodies confirmed simultaneous production of the corresponding proteins.

[0200] To provide further evidence that PHB was being synthesized in insect cells, T. ni cells which had been infected with a baculovirus vector encoding rat FAS DH⁰ and/or a baculovirus vector encoding PHA synthase were analyzed for the presence of granules. Infected cells were fixed in paraformaldehyde and incubated with anti-PHA synthase antibodies (Williams et al., Protein Exp. Purif., 7, 203 (1996)). Granules were observed only in doubly infected cells (Williams et al., App. Environ., Micro., 62, 2540 (1996)).

[0201] Characterization of PHB production in insect cells. In order to determine if de novo synthesis of PHB was occurring in Sf21 cells that co-express the rat FAS DH mutant and PHB synthase, fractions of these samples were extracted, the extract subjected to propanolysis, and analyzed for the presence of propylhydroxybutyrate by gas chromatography (FIG. 17). A unique peak with a retention time that coincided with a propylhydroxybutyrate standard was detected only in the double infection samples at 48 and 72 hours, in contrast to the individually expressed gene products and uninfected controls, which were negative. These samples were analyzed further by GC/MS to confirm the identity of the product. FIG. 18 shows mass spectroscopy data corresponding to the material obtained from peak 10.1 in the gas chromatograph compared to a propylhydroxybutyrate standard. The results show that PHB synthesis is occurring only in Sf21 cells co-expressing the rat FAS DH mutant cDNA and the phbC gene from A. eutrophus. Integration of the peak in the gas chromatograph corresponding to propylhydroxybutyrate revealed that approximately 1 mg of PHB was isolated from 1 liter culture of Sf21 cells (approximately 600 mg dry cell weight of Sf21 cells). Thus, the ratFAS206 protein effectively replaces the β-ketothiolase and acetoacetyl-CoA reductase functions, resulting in the production of PHB by a novel pathway.

[0202] The approach described here provides a new strategy to combine metabolic pathways that are normally engaged in primary anabolic functions for production of polyesters. The premature termination of the normal fatty acid biosynthetic pathway to provide suitably modified acylCoA monomers for use in PHA synthesis can be applied to both prokaryotic and eukaryotic expression, since the formation of polymer will not be dependent on specialized feedstocks. Thus, once a recombinant PHA monomer synthase is introduced into a prokaryotic or eukaryotic system, and co-expressed with the appropriate PHA synthase, novel biopolymer formation can occur.

Example 3 Cloning and Sequencing of the Vep ORFI PKS Gene Cluster

[0203] The entire PKS cluster from Streptomyces venezuelae was cloned using a heterologous hybridization strategy. A 1.2 kb DNA fragment that hybridized strongly to a DNA encoding an eryA PKS β-ketoacyl synthase domain was cloned and used to generate a plasmid for gene disruption. This method generated a mutant strain blocked in the synthesis of the antibiotic. A S. venezuelae genomic DNA library was generated and used to clone a cosmid containing the complete methymycin aglycone PKS DNA. Fine-mapping analysis was performed to identify the order and sequence of catalytic domains along the multifunctional PKS (FIG. 19). DNA sequence analysis of the vep ORF1 showed that the order of catalytic domains is KS^(Q)/AT/ACP/KS/AT/KR/ACP/KS/AT/DH/KR/ACP. The complete DNA sequence, and corresponding amino acid sequence, of the vep ORFI is shown in FIG. 23 (SEQ ID NO:1 and SEQ ID NO:2, respectively).
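The domain order reported for the vep ORF1 can be grouped into a loading module and extension modules for inspection. The sketch below is illustrative only and rests on one stated assumption, namely that each module terminates at its ACP domain; the function name is hypothetical, and the grouping covers only the domain string quoted above.

```python
# Sketch: group a slash-delimited domain order into modules, assuming that
# every module ends with an ACP domain. The domain string is quoted from the
# text; the grouping rule and function name are illustrative assumptions.

VEP_ORF1_DOMAINS = "KS^(Q)/AT/ACP/KS/AT/KR/ACP/KS/AT/DH/KR/ACP"


def split_modules(domain_string):
    """Return a list of modules, each a list of domain names."""
    modules, current = [], []
    for domain in domain_string.split("/"):
        current.append(domain)
        if domain == "ACP":          # assumed module boundary
            modules.append(current)
            current = []
    if current:                      # trailing domains without an ACP
        modules.append(current)
    return modules


if __name__ == "__main__":
    for index, module in enumerate(split_modules(VEP_ORF1_DOMAINS)):
        label = "loading" if index == 0 else "module %d" % index
        print(label, "-".join(module))
```

Under this assumption the string resolves into a KS^(Q)/AT/ACP loading module followed by two extension modules, the second of which carries the additional DH domain.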

[0204] The sequence data indicated that the PKS gene cluster encodes a polyene of twelve carbons. The vep gene cluster contains 5 polyketide synthase modules, with a loading module at its 5′ end and an ending domain at its 3′ end. Each of the sequenced modules includes a keto-ACP (KS), an acyltransferase (AT), a dehydratase (DH), a keto-reductase (KR), and an acyl carrier protein domain. The six acyltransferase domains in the cluster are responsible for the incorporation of six acetyl-CoA moieties into the product. The loading module contains a KS^(Q), an AT and an ACP domain. KS^(Q) refers to a domain that is homologous to a KS domain except that the active site cysteine (C) is replaced by glutamine (Q). There is no counterpart to the KS^(Q) domain in the PKS clusters which have been previously characterized.

[0205] The ending domain (ED) is an enzyme which is responsible for the attachment of the nascent polyketide chain onto another molecule. The amino acid sequence of ED resembles an enzyme, HetM, which is involved in Anabaena heterocyst formation. The homology between vep and HetM suggests that the polypeptide encoded by the vep gene cluster may synthesize a polyene-containing composition which is present in the spore coat or cell wall of its natural host, S. venezuelae.

Example 4 Preparation of a Vector Encoding a Saturated β-hydroxyhexanoylCoA Monomer or an Unsaturated β-hydroxyhexanoyl CoA Monomer

[0206] To provide a recombinant monomer synthase that generates a saturated β-hydroxyhexanoylCoA or unsaturated β-hydroxyhexanoylCoA monomer, the linear correspondence between the genetic organization of the Type I macrolide PKS and the catalytic domain organization in the multifunctional proteins is assessed (Donadio et al., supra, 1991; Katz et al., Ann. Rev. Microbiol., 47, 875 (1993)). First, a DNA encoding a TE is added to the 3′ end of an ORF1 of a Type I PKS, preferably the met ORF I (FIG. 6), as recently described by Cortes et al. (Science, 268, 1487 (1995)) in the erythromycin system. To ensure that the DNA encoding the TE is completely active, DNA encoding a linker region separating a normal ACP-TE region in a PKS, for example, the one found in met PKS ORF5 (FIG. 5), will be incorporated into the DNA. The resulting vector can be introduced into a host cell and the TE activity, rate of release of the CoA product, and identity of the fatty acid chain determined.

[0207] The acyl chain that is most likely to be released is the CoA ester, specifically the 3-hydroxy-4-methyl heptenoylCoA ester, since the fully elongated chain is presumably released in this form prior to macrolide cyclization. If the CoA form of the acyl chain is not observed, then a gene encoding a CoA ligase will be cloned and co-expressed in the host cell to catalyze formation of the desired intermediate.

[0208] There is clear precedent for release of the predicted premature termination products from mutant strains of macrolide-producing Streptomyces that produce intermediates in macrolide synthesis (Huber et al., Antimicrob. Agents Chemother., 34, 1535 (1990); Kinoshita et al., J. Chem. Soc., Chem. Comm., 14, 943 (1988)). The structure of these intermediates is consistent with the linear organization of functional domains in macrolide PKSs, particularly those related to eryA, tyl, and met. Other known PKS gene clusters include, but are not limited to, the gene cluster encoding 6-methylsalicylic acid synthase (Beck et al., Eur. J. Biochem., 192, 487 (1990)), soraphen A (Schupp et al., J. Bacteriol., 177, 3673 (1995)), and sterigmatocystin (Yu et al., J. Bacteriol., 177, 4792 (1995)).

[0209] Once the release of the 3-hydroxy-4-methyl heptenoylCoA ester is established, DNA encoding the extender unit AT in met module 1 is replaced to change the specificity from methylmalonylCoA to malonylCoA (FIGS. 4-6). This change eliminates methyl group branching in the β-hydroxy acyl chain. While comparison of known AT amino acid sequences shows high overall amino acid sequence conservation, distinct regions are readily apparent where significant deletions or insertions have occurred. For example, comparison of malonyl and methylmalonyl amino acid sequences reveals a 37 amino acid deletion in the central region of the malonyl transferase. Thus, to change the specificity of the methylmalonyl transferase to malonyl transferase, the met ORF1 DNA encoding the 37 amino acid sequence of MMT will be deleted, and the resulting gene will be tested in a host cell for production of the desmethyl species, 3-hydroxyheptenoylCoA. Alternatively, the DNA encoding the entire MMT can be replaced with a DNA encoding an intact MT to effect the desired chain construction.

[0210] After replacing MMT with MT, DNA encoding DH/ER will be introduced into DNA encoding met ORFI module 1. This modification results in a multifunctional protein that generates a methylene group at C-3 of the acyl chain (FIG. 6). The DNA encoding DH/ER will be PCR amplified from the available eryA or tyl PKS sequences, including the DNA encoding the required linker regions, employing a primer pair to conserved sequences 5′ and 3′ of the DNA encoding DH/ER. The PCR fragment will then be cloned into the met ORFI. The result is a DNA encoding a multifunctional protein (MT* DH/ER*TE*). This protein possesses the full complement of keto group processing steps and results in the production of heptenoylCoA.

[0211] The DNA encoding dehydrase in met module 2 is then inactivated, using site-directed mutagenesis in a scheme similar to that used to generate the rat FAS DH- described above (Joshi et al., J. Biol. Chem., 26, 22508 (1993)). This preserves the required (R)-3-hydroxy group which serves as the substrate for PHA synthases and results in (R)-3-hydroxyheptanoylCoA species.

[0212] The final domain replacement will involve the DNA encoding the starter unit acyltransferase in met module 1 (FIG. 5), to change the specificity from propionyl CoA to acetyl CoA. This shortens the (R)-3-hydroxy acyl chain from heptanoyl to hexanoyl. The DNA encoding the catalytic domain will need to be generated based on a FAS or 6-methylsalicylic acid synthase model (Beck et al., Eur. J. Biochem., 192, 487 (1990)) or by using site-directed mutagenesis to alter the specificity of the resident met PKS propionyltransferase sequence. Limiting the initiator species to acetylCoA can result in the use of this starter unit by the monomer synthase. Previous work with macrolide synthases has shown that some are able to accept a wide range of starter unit carboxylic acids. This is particularly well documented for avermectin synthase, where over 60 new compounds have been produced by altering the starter unit substrate in precursor feeding studies (Dutton et al., J. Antibiotics, 44, 357 (1991)).

Example 5 Preparation of a Vector Encoding a Recombinant Monomer Synthase that Synthesizes 3-hydroxyl-4-hexenoic Acid

[0213] To provide a recombinant monomer synthase that synthesizes 3-hydroxyl-4-hexenoic acid, a precursor for polyhydroxyhexenoate, the DNA segment encoding the loading and the first module of the vep gene cluster was linked to the DNA segment encoding module 7 of the tyl gene cluster so as to yield a recombinant DNA molecule encoding a fusion polypeptide which has no amino acid differences relative to the corresponding amino acid sequence of the parent modules. The fusion polypeptide catalyzes the synthesis of 3-hydroxyl-4-hexenoic acid. The recombinant DNA molecule was introduced into SCP2, a Streptomyces vector, under the control of the act promoter (pDHS502, FIG. 20). A polyhydroxyalkanoate polymerase gene, phaC1 from Pseudomonas oleovorans, was then introduced downstream of the recombinant PKS cluster (pDHS505; FIGS. 22 and 23). The DNA segment encoding the polyhydroxyalkanoate polymerase is linked to the DNA segment encoding the recombinant PKS synthase so as to yield a fusion polypeptide which synthesizes polyhydroxyhexenoate in Streptomyces. Polyhydroxyhexenoate, a biodegradable thermoplastic, is not naturally synthesized in Streptomyces, or as a major product in any other organism. Moreover, the unsaturated double bond in the side chain of polyhydroxyhexenoate may result in a polymer which has superior physical properties as a biodegradable thermoplastic over the known polyhydroxyalkanoates.

Example 6 Deletion of the desR Gene of the Desosamine Biosynthetic Gene Cluster

[0214] As some macrolides have more than one attached sugar moiety, the assignment of sugar biosynthetic genes to the appropriate sugar biosynthetic pathway can be quite difficult. Since methymycin (a compound of formula (1)) and neomethymycin (a compound of formula (2)) (FIG. 24) (Donin et al., 1953; Djerassi et al., 1956), two closely related macrolide antibiotics produced by Streptomyces venezuelae, contain desosamine as their sole sugar component, the organization of the sugar biosynthetic genes in the methymycin/neomethymycin gene cluster may be less complicated. Thus, this system was chosen for the study of the biosynthesis of desosamine, a N,N-dimethylamino-3,4,6-trideoxyhexose, which also exists in the erythromycin structure (Flinn et al., 1954).

[0215] To study the formation of this unusual sugar, a DNA library was constructed by partially digesting the genomic DNA of S. venezuelae (ATCC 15439) with Sau3A I into 35-40 kb fragments which were ligated into the cosmid vector pNJ1 (Tuan et al., 1990). The recombinant DNA was packaged into bacteriophage λ which was used to transfect E. coli DH5α. The resulting cosmid library was screened for desired clones using the tylA1 and tylA2 genes from the tylosin biosynthetic cluster as probes (Baltz et al., 1988; Merson-Davies et al., 1994). These two probes are specific for sugar biosynthetic genes whose products catalyze the first two steps universally followed by all unusual 6-deoxyhexoses studied thus far. The initial reaction involves conversion of glucose-1-phosphate to TDP-D-glucose by α-D-glucose-1-phosphate thymidylyltransferase (TylA1) and subsequently, TDP-D-glucose is transformed to TDP-4-keto-6-deoxy-D-glucose by TDP-D-glucose 4,6-dehydratase (TylA2). Three cosmids were found to contain genes homologous to tylA1 and tylA2. Further analysis of these cosmids led to the identification of nine open reading frames (ORFs) downstream of the PKS genes (FIG. 24). Based on sequence similarities to other sugar biosynthetic genes, especially those derived from the erythromycin cluster (Gaisser et al., 1997; Summers et al., 1997), eight of these nine ORFs are believed to be involved in the biosynthesis of TDP-D-desosamine. Interestingly, the ery cluster lacks homologs of the tylA1 and tylA2 genes that are responsible for the first two steps in the desosamine pathway. It is possible that the erythromycin biosynthetic machinery may rely on a general cellular pool of TDP-4-keto-6-deoxy-D-glucose for mycarose and desosamine formation. Depicted in FIG. 24 is a biosynthetic pathway for TDP-D-desosamine.

[0216] Although eight of the nine ORFs have been assigned to desosamine formation, the presence of desR, which shows strong sequence homology to β-glucosidases (as high as 39% identity and 46% similarity) (Castle et al., 1998), within the desosamine gene cluster is puzzling. To investigate the function of DesR relative to the biosynthesis of methymycin/neomethymycin, a disruption plasmid (pBL1005) derived from pKC1139 (containing an apramycin resistance marker) (Bierman et al., 1992) was constructed in which a 1.0 kb NcoI/XhoI fragment of the desR gene was deleted and replaced by the thiostrepton resistance (tsr) gene (1.1 kb) (Bibb et al., 1985) via blunt-end ligation. This plasmid was used to transform E. coli S17-1, which serves as the donor strain to introduce the pBL1005 construct through conjugal transfer into the wild-type S. venezuelae (Bierman et al., 1992). The double crossover mutants in which chromosomal desR had been replaced with the disrupted gene were selected according to their thiostrepton-resistant and apramycin-sensitive characteristics. Southern blot hybridization analysis was used to confirm the gene replacement.

[0217] The desired mutant was first grown at 29° C. in seed medium for 48 hours, and then inoculated and grown in vegetative medium for another 48 hours (Cane et al., 1993). After the fermentation broth was centrifuged at 10,000 g to remove cellular debris and mycelia, the supernatant was adjusted to pH 9.5 with concentrated KOH, and extracted with an equivolume of chloroform (four times). The organic layer was dried over sodium sulfate and evaporated to dryness. The amber oil-like crude products were first subjected to flash chromatography on silica gel using a gradient of 0-40% methanol in chloroform, followed by HPLC purification on a C₁₈ column eluted isocratically with 45% acetonitrile in 57 mM ammonium acetate (pH 6.7). In addition to methymycin (a compound of formula (1)) and neomethymycin (a compound of formula (2)), two new products were isolated. The yield of a compound of formula (13) and of a compound of formula (14) was each in the range of 5-10 mg/L of fermentation broth. However, a compound of formula (1) and a compound of formula (2) remained the major products. High-resolution FAB-MS revealed that both new compounds have identical molecular compositions that differ from methymycin/neomethymycin by an extra hexose. Extensive spectral analysis showed these two new compounds to be C-2′ β-glucosylated methymycin and neomethymycin (a compound of formula (13) and formula (14), respectively).

[0218] The spectral data of (13): ¹H NMR (acetone-d₆) δ 6.56 (1H, d, J=16.0, 9-H), 6.46 (1H, d, J=16.0, 8-H), 4.67 (1H, dd, J=10.8, 2.0, 11-H), 4.39 (1H, d, J=7.5, 1′-H), 4.32 (1H, d, J=8.0, 1″-H), 3.99 (1H, dd, J=11.5, 2.5, 6″-H), 3.72 (1H, dd, J=11.5, 5.5, 6″-H), 3.56 (1H, m, 5′-H), 3.52 (1H, d, J=10.0, 3-H), 3.37 (1H, t, J=8.5, 3″-H), 3.33 (1H, m, 5″-H), 3.28 (1H, t, J=8.5, 4″-H), 3.23 (1H, dd, J=10.5, 7.5, 2′-H), 3.15 (1H, dd, J=8.5, 8.0, 2″-H), 3.10 (1H, m, 2-H), 2.75 (1H, 3′-H, buried under H₂O peak), 2.42 (1H, m, 6-H), 2.28 (6H, s, NMe₂), 1.95 (1H, m, 12-H), 1.9 (1H, m, 5-H), 1.82 (1H, m, 4′-H), 1.50 (1H, m, 12-H), 1.44 (3H, d, J=7.0, 2-Me), 1.4 (1H, m, 5-H), 1.34 (3H, s, 10-Me), 1.3 (1H, m, 4-H), 1.25 (1H, m, 4′-H), 1.20 (3H, d, J=6.0, 5′-Me), 1.15 (3H, d, J=7.0, 6-Me), 0.95 (3H, d, J=6.0, 4-Me), 0.86 (3H, t, J=7.5, 12-Me). High-resolution FAB-MS: calc for C₃₁H₅₄NO₁₂ (M+H)⁺ 632.3646, found 632.3686.

[0219] Spectral data of (14): ¹H NMR (acetone-d₆) δ 6.69 (1H, dd, J=16.0, 5.5 Hz, 9-H), 6.55 (1H, dd, J=16.0, 1.3, 8-H), 4.71 (1H, dd, J=9.0, 2.0, 11-H), 4.37 (1H, d, J=7.0, 1′-H), 4.31 (1H, d, J=8.0, 1″-H), 3.97 (1H, dd, J=11.5, 2.5, 6″-H), 3.81 (1H, dq, J=9.0, 6.0, 12-H), 3.72 (1H, dd, J=11.5, 5.0, 6″-H), 3.56 (1H, m, 5′-H), 3.50 (1H, bd, J=10.0, 3-H), 3.36 (1H, t, J=8.5, 3″-H), 3.32 (1H, m, 5″-H), 3.30 (1H, t, J=8.5, 4″-H), 3.23 (1H, dd, J=10.2, 7.0, 2′-H), 3.13 (1H, dd, J=8.5, 8.0, 2″-H), 3.09 (1H, m, 2-H), 3.08 (1H, m, 10-H), 2.77 (1H, ddd, J=12.5, 10.2, 4.5, 3′-H), 2.41 (1H, m, 6-H), 2.28 (6H, s, NMe₂), 1.89 (1H, t, J=13.0, 5-H), 1.83 (1H, ddd, J=12.5, 4.5, 1.5, 4′-H), 1.41 (3H, d, J=7.0, 2-Me), 1.3 (1H, m, 4-H), 1.25 (1H, m, 5-H), 1.2 (1H, m, 4′-H), 1.20 (3H, d, J=6.0, 5′-Me), 1.17 (6H, d, J=7.0, 6-Me, 10-Me), 1.12 (3H, d, J=6.0, 12-Me), 0.96 (3H, d, J=6.0, 4-Me). ¹³C NMR (acetone-d₆) δ 204.1 (C-7), 175.8 (C-1), 148.2 (C-9), 126.7 (C-8), 108.3 (C-1″), 104.2 (C-1′), 85.1 (C-3), 83.0 (C-2′), 78.2 (C-3″), 78.1 (C-5″), 76.6 (C-2″), 76.4 (C-11), 71.8 (C-4″), 69.3 (C-5′), 66.1 (C-12), 66.0 (C-3′), 63.7 (C-6″), 46.2 (C-6), 44.4 (C-2), 40.8 (NMe₂), 36.4 (C-10), 34.7 (C-5), 34.0 (C-4), 29.5 (C-4′), 21.5 (5′-Me), 21.5 (12-Me), 17.9 (6-Me), 17.7 (4-Me), 17.2 (2-Me), 9.9 (10-Me). High-resolution FAB-MS: calc for C₃₁H₅₄NO₁₂ (M+H)⁺ 632.3646, found 632.3648.
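The calculated (M+H)⁺ value reported for the compounds of formula (13) and (14) can be checked with a minimal Python sketch. It is offered only as an illustration and was not part of the original analysis. The isotope masses are standard monoisotopic values, the helper names are hypothetical, and the electron mass is not subtracted, a convention that appears to match the reported calculated value.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes.
MONOISOTOPIC = {"C": 12.000000, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}


def monoisotopic_mass(formula: str) -> float:
    """Sum monoisotopic masses for a simple formula such as 'C31H54NO12'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


def ppm_error(found: float, calc: float) -> float:
    """Deviation of the measured m/z from the calculated m/z, in ppm."""
    return (found - calc) / calc * 1e6


calc = monoisotopic_mass("C31H54NO12")  # formula of the (M+H)+ ion
print(f"calculated (M+H)+ = {calc:.4f}")
for label, found in [("(13)", 632.3686), ("(14)", 632.3648)]:
    print(f"{label}: found {found:.4f}, error {ppm_error(found, calc):+.1f} ppm")
```

Under these assumptions the measured values for (13) and (14) deviate from the calculated mass by roughly 6 ppm and less than 1 ppm, respectively.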

[0220] The coupling constant (d, J=8.0 Hz) of the anomeric hydrogen (1″-H) of the added glucose and the magnitude of the downfield shift (11.8 ppm) of C-2′ of desosamine are both consistent with the assigned C-2′ β-configuration (Seo et al., 1978).

[0221] The antibiotic activity of compounds of formula (13) and (14) against Streptococcus pyogenes was examined by separately applying 20 μL of each sample (1.6 mM in MeOH) to sterilized filter paper discs, which were placed onto the surface of S. pyogenes grown on Mueller-Hinton agar plates (Mangahas, 1996). After being grown overnight at 37° C., the plates of the controls (a compound of formula (1) and (2)) showed clearly visible inhibition zones. In contrast, no such clearings were discernible around the discs of a compound of formula (13) and (14). Evidently, β-glucosylation at C-2′ of desosamine in methymycin/neomethymycin renders these antibiotics inactive.
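For orientation, the amount of compound applied to each disc follows directly from the stated volume and concentration. The Python sketch below works through that arithmetic. The molar mass of about 631.8 g/mol is an assumption, estimated here from the neutral formula C₃₁H₅₃NO₁₂ implied by the (M+H)⁺ data, and the function names are hypothetical.

```python
def nmol_per_disc(volume_ul: float, conc_mm: float) -> float:
    """Amount applied per disc: microliters x (mmol/L) gives nanomoles."""
    return volume_ul * conc_mm


def micrograms_per_disc(volume_ul: float, conc_mm: float, molar_mass: float) -> float:
    """Mass applied per disc: nmol x (g/mol) gives ng; divide by 1000 for micrograms."""
    return nmol_per_disc(volume_ul, conc_mm) * molar_mass / 1000.0


ASSUMED_MOLAR_MASS = 631.8  # g/mol, estimate for C31H53NO12 (assumption)

amount = nmol_per_disc(20.0, 1.6)
mass = micrograms_per_disc(20.0, 1.6, ASSUMED_MOLAR_MASS)
print(f"{amount:.0f} nmol per disc, about {mass:.0f} micrograms")
```

On these figures each disc carried about 32 nmol, or roughly 20 μg, of the glucosylated compound.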

[0222] It should be noted that similar phenomena involving inactivation of macrolide antibiotics by glycosylation are known (Celmer et al., 1985; Kuo et al., 1989; Sasaki et al., 1996). For example, it was found that when erythromycin was given to Streptomyces lividans, which contains a macrolide glycosyltransferase (MgtA), the bacterium was able to defend itself by glycosylating the drug (Cundliffe, 1992; Jenkins et al., 1991). Such a macrolide glycosyltransferase activity has been detected in 15 out of a total of 32 actinomycete strains producing various polyketide antibiotics (Sasaki et al., 1996). Interestingly, the co-existence of a macrolide glycosyltransferase (OleD) capable of deactivating oleandomycin by glucosylation (Hernandez et al., 1993), and an extracellular β-glucosidase capable of removing the added glucose from the deactivated oleandomycin in Streptomyces antibioticus (Vilches et al., 1992) has led to the speculation of glycosylation as a possible self-resistance mechanism in S. antibioticus. Although the genes of the aforementioned glycosyltransferases have been cloned in a few cases, such as mgtA of S. lividans and oleD of S. antibioticus, the whereabouts of macrolide β-glycosidase genes remain obscure. Interestingly, the recently released eryBI sequence, which is part of the erythromycin biosynthetic cluster, is highly homologous to desR (55% identity) (Gaisser et al., 1997).

[0223] The discovery of desR, a macrolide β-glucosidase gene, within the desosamine gene cluster is thus significant, and the accumulation of deactivated compounds of formula (13) and (14) after desR disruption provides direct molecular evidence indicating that a similar self-defense mechanism via glycosylation/deglycosylation may also be operative in S. venezuelae. However, because significant amounts of methymycin and neomethymycin are also present in the fermentation broth of the mutant strain, glucosylation of desosamine may not be the primary self-resistance mechanism in S. venezuelae. Indeed, an rRNA methyltransferase gene found upstream from the PKS genes in this cluster may confer the primary self-resistance protection. Thus, these results are consistent with the fact that antibiotic producing organisms generally have more than one defensive option (Cundliffe, 1989). In light of this observation, it is conceivable that methymycin/neomethymycin may be produced in part as the inert diglycosides (a compound of formula (13) or (14)), and that the macrolide β-glucosidase encoded by desR is responsible for transforming methymycin/neomethymycin from their dormant state to their active form. Supporting this idea, the translated desR gene has a leader sequence characteristic of secretory proteins (von Heijne, 1986; von Heijne, 1989). Thus, DesR may be transported through the cell membrane and hydrolyze the modified antibiotics extracellularly to activate them (FIG. 25).

[0224] Summary

[0225] Inspired by the complex assembly and the enzymology of aminodeoxy sugars that are frequently found as essential components of macrolide antibiotics, the entire desosamine biosynthetic gene cluster from the methymycin and neomethymycin producing strain Streptomyces venezuelae was cloned, sequenced, and mapped. Eight of the nine mapped genes were assigned to the biosynthesis of TDP-D-desosamine based on sequence similarities to those derived from the erythromycin cluster. The remaining gene, designated desR, showed strong sequence homology to β-glucosidases.

[0226] To investigate the function of the encoded protein (DesR), a disruption mutant was constructed in which a NcoI/XhoI fragment of the desR gene was deleted and replaced by the thiostrepton resistance (tsr) gene. In addition to methymycin and neomethymycin, two new products were isolated from the fermentation of the mutant strain. These two new compounds, which are biologically inactive, were found to be C-2′ β-glucosylated methymycin and neomethymycin. Since the translated desR gene has a leader sequence characteristic of secretory proteins, the DesR protein may be an extracellular β-glucosidase capable of removing the added glucose from the modified antibiotics to activate them. Thus, the occurrence of desR within the desosamine gene cluster and the accumulation of deactivated glucosylated methymycin/neomethymycin upon disruption of desR provide strong molecular evidence suggesting that a self-resistance mechanism via glucosylation may be operative in S. venezuelae.

[0227] Thus, the desR gene can be used as a probe to identify homologs in other antibiotic biosynthetic pathways. Deletion of the corresponding macrolide glycosidase gene in other antibiotic biosynthetic pathways may lead to the accumulation of the glycosylated products, which may be used as prodrugs with reduced cytotoxicity. Glycosylation also holds promise as a tool to regulate and/or minimize the potential toxicity associated with new macrolide antibiotics produced by genetically engineered microorganisms. Moreover, the availability of macrolide glycosidases, which can be used for the activation of newly formed antibiotics that have been deliberately deactivated by engineered glycosyltransferases, may be useful in the development of novel antibiotics using the combinatorial biosynthetic approach (Hopwood et al., 1990; Katz et al., 1993; Hutchinson et al., 1995; Carreras et al., 1997; Kramer et al., 1996; Khosla et al., 1996; Jacobsen et al., 1997; Marsden et al., 1998).

Example 7 Deletion of the desVI Gene of the Desosamine Biosynthetic Gene Cluster

[0228] The emergence of pathogenic bacteria resistant to many commonly used antibiotics poses a serious threat to human health and has been the impetus for the present resurgent search for new antimicrobial agents (Box et al., 1997; Davies, 1996; Service, 1995). Since the first report on using genetic engineering techniques to create “hybrid” polyketides (Hopwood et al., 1995), the potential of manipulating the genes governing the biosynthesis of secondary metabolites to create new bioactive compounds, especially macrolide antibiotics, has received much attention (Kramer et al., 1996; Khosla et al., 1996). This class of clinically important drugs consists of two essential structural components: a polyketide aglycone and the appended deoxy sugars (Omura, 1984). The aglycone is synthesized via sequential condensations of acyl thioesters catalyzed by a highly organized multi-enzyme complex, polyketide synthase (PKS) (Hopwood et al., 1990; Katz, 1993; Hutchinson et al., 1995; Carreras et al., 1997). Recent advances in the understanding of polyketide biosynthesis have allowed recombination of the PKS genes to construct an impressive array of novel skeletons (Kramer et al., 1996; Khosla et al., 1996; Hopwood et al., 1990; Katz, 1993; Hutchinson et al., 1995; Carreras et al., 1997; Epp et al., 1989; Donadio et al., 1993; Arisawa et al., 1994; Jacobsen et al., 1997; Marsden et al., 1998). Without the sugar components, however, these new compounds are usually biologically inactive. Hence, if one plans to make new macrolide antibiotics by a combinatorial biosynthetic approach, two immediate challenges must be overcome: assembling a repertoire of novel sugar structures and then having the capacity to couple these sugars to the structurally diverse macrolide aglycones.

[0229] Unfortunately, knowledge of the formation of the unusual sugars in these antibiotics remains limited (Liu et al., 1994; Kirschning et al., 1997; Johnson et al., 1998). This is partly because the sugar genes are generally scattered at both ends of the PKS genes. Such an organization within the macrolide biosynthetic gene cluster makes it difficult to distinguish the sugar genes from those encoding regulatory proteins or aglycone modification enzymes that are also interspersed in the same regions. The task can be made even more formidable if the macrolides contain multiple sugar components. In view of the “scattered” nature of the sugar biosynthetic genes, the antibiotic methymycin (a compound of formula (1) in FIG. 24) and its co-metabolite, neomethymycin (a compound of formula (2) in FIG. 24), of Streptomyces venezuelae present themselves as an attractive system to study the formation of deoxy sugars (Donin et al., 1953; Djerassi et al., 1956). First, they carry D-desosamine (a compound of formula (3)), a prototypical aminodeoxy sugar that also exists in erythromycin. Second, since desosamine is the only sugar attached to the macrolactone of formula (1) and (2), identification of the sugar biosynthetic genes within the methymycin/neomethymycin gene cluster should be possible with much more certainty.

[0230] A 10 kb stretch of DNA downstream from the methymycin/neomethymycin gene cluster, which is about 60 kb in length, was found to harbor the entire desosamine biosynthetic gene cluster (FIG. 26). Among the nine open reading frames (ORFs) mapped in this segment, eight are likely to be involved in desosamine formation, while the remaining one, desR, encodes a macrolide β-glycosidase that may be involved in a self-resistance mechanism. Their identities, shown in FIG. 26, are assigned based on sequence similarities to other sugar biosynthetic genes (Gaisser et al., 1997; Summers et al., 1997). The proposed pathway is well founded on literature precedent and mechanistic intuition for the construction of aminodeoxy sugars (Liu et al., 1994; Kirschning et al., 1997; Johnson et al., 1998).

[0231] To determine whether new methymycin/neomethymycin analogues carrying modified sugars could be generated by altering the desosamine biosynthetic genes, the desVI gene, which has been predicted to encode the N-methyltransferase, was chosen as a target (Gaisser et al., 1997; Summers et al., 1997). The deduced desVI product is most closely related to that of eryCVI from the erythromycin producing strain Saccharopolyspora erythraea (70% identity), and also strongly resembles the predicted products of rdmD from the rhodomycin cluster of Streptomyces purpurascens (Niemi et al., 1995), srmX from the spiramycin cluster of Streptomyces ambofaciens (Geistlich et al., 1992), and tylM1 from the tylosin cluster of Streptomyces fradiae (Gandecha et al., 1997). All of these enzymes contain the consensus sequence LLDV(I)ACGTG (SEQ ID NO:25) (Gaisser et al., 1997; Summers et al., 1997) near their N-terminus, which is part of the S-adenosylmethionine binding site (Ingrosso et al., 1989; Haydock et al., 1991).
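A consensus such as LLDV(I)ACGTG, read here as valine or isoleucine at the fourth position, can be located in a deduced protein sequence with a simple pattern search. The Python sketch below illustrates the idea. The two test sequences are invented for demonstration and are not the DesVI or EryCVI sequences, and the function name is hypothetical.

```python
import re

# LLDV(I)ACGTG: position 4 is read as either V or I (an interpretive assumption).
SAM_MOTIF = re.compile(r"LLD[VI]ACGTG")


def find_sam_motif(protein: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched text) for each motif occurrence."""
    return [(m.start() + 1, m.group()) for m in SAM_MOTIF.finditer(protein.upper())]


# Hypothetical sequences used only to exercise the search.
examples = {
    "hypothetical_A": "MSTRHLLDVACGTGAARLLEQ",
    "hypothetical_B": "MKPLLDIACGTGRWSE",
    "hypothetical_C": "MKPLLDAACGTGRWSE",  # alanine at position 4, so no match
}

for name, seq in examples.items():
    print(name, find_sam_motif(seq))
```

A strict pattern of this kind reports only exact matches to the consensus; database searches for homologs would ordinarily use alignment-based tools instead.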

[0232] First, the deletion of desVI should have little polar effect (Lin et al., 1984) on the expression of other desosamine biosynthetic genes because the ORF (desR) lying immediately downstream from desVI is not directly involved in desosamine formation, and those lying further downstream are transcribed in the opposite direction. Second, since N,N-dimethylation is almost certainly the last step in the desosamine biosynthetic pathway (Liu et al., 1994; Kirschning et al., 1997; Johnson et al., 1998; Gaisser et al., 1997; Summers et al., 1997), perturbing this step may lead to the accumulation of a compound of formula (4), which stands the best chance among all other intermediates of being recognized by the glycosyltransferase (DesVII) for successful linkage to the macrolactone of formula (6) (FIG. 25). Deletion and/or disruption of a single biosynthetic gene often affects the pathway at more than one specific step. In fact, disruption of eryCVI, the desVI equivalent in the erythromycin cluster, which has been predicted to encode a similar N-methylase to make desosamine in erythromycin (Gaisser et al., 1997; Summers et al., 1997), led to the accumulation of an intermediate devoid of the entire desosamine moiety (Summers et al., 1997).

[0233] A plasmid, pBL3001, in which desVI was replaced by the thiostrepton gene (tsr) (Bibb et al., 1985), was constructed and introduced into wild type S. venezuelae by conjugal transfer using E. coli S17-1 (Bierman et al., 1992). Two identical double crossover mutants, KdesVI-21 and KdesVI-22, with phenotypes of thiostrepton resistance (Thio^(R)) and apramycin sensitivity (Apm^(S)) were obtained. Southern blot hybridization using tsr or a 1.1 kb HincII fragment from the desVII region further confirmed that the desVI gene was indeed replaced by tsr on the chromosome of these mutants. The KdesVI-21 mutant was first grown at 29° C. in seed medium (100 mL) for 48 hours, and then inoculated and grown in vegetative medium (3 L) for another 48 hours (Cane et al., 1993). The fermentation broth was centrifuged to remove the cellular debris and mycelia, and the supernatant was adjusted to pH 9.5 with concentrated KOH, followed by extraction with chloroform. No methymycin or neomethymycin was found; instead, 10-deoxymethynolide (6) (350 mg) (Lambalot et al., 1992) and two new macrolides containing an N-acetylated amino sugar, a compound of formula (7) (20 mg) and a compound of formula (8) (15 mg), were isolated. Their structures were determined by spectral analyses and high-resolution MS.

[0234] Spectral data of formula 7 are: ¹H NMR (CDCl₃) δ 6.62 (1H, d, J=16.0, H-9), 6.22 (1H, d, J=16.0, H-8), 5.75 (1H, d, J=7.5, N-H), 4.75 (1H, dd, J=10.8, 2.2, H-11), 4.28 (1H, d, J=7.5, H-1′), 3.95 (1H, m, H-3′), 3.64 (1H, d, J=10.5, H-3), 3.56 (1H, m, H-5′), 3.16 (1H, dd, J=10.0, 7.5, H-2′), 2.84 (1H, dq, J=10.5, 7.0, H-2), 2.55 (1H, m, H-6), 2.02 (3H, s, NAc), 1.95 (1H, m, H-12), 1.90 (1H, m, H-4′), 1.66 (1H, m, H-5), 1.50 (1H, m, H-12), 1.41 (3H, d, J=7.0, 2-Me), 1.40 (1H, m, H-5), 1.34 (3H, s, 10-Me), 1.25 (1H, m, H-4), 1.22 (1H, m, H-4′), 1.21 (3H, d, J=6.0, H-6′), 1.17 (3H, d, J=7.0, 6-Me), 1.01 (3H, d, J=6.5, 4-Me), 0.89 (3H, t, J=7.2, 12-Me); ¹³C NMR (CDCl₃) δ 204.3 (C-7), 175.1 (C-1), 171.8 (Me-C=O), 149.1 (C-9), 125.3 (C-8), 104.4 (C-1′), 85.4 (C-3), 76.3 (C-11), 75.4 (C-2′), 74.1 (C-10), 68.6 (C-5′), 51.9 (C-3′), 45.0 (C-6), 44.0 (C-2), 38.5 (C-4′), 33.8 (C-5), 33.3 (C-4), 23.1 (Me-C=O), 21.1 (C-12), 20.6 (C-6′), 19.2 (10-Me), 17.5 (6-Me), 17.2 (4-Me), 16.2 (2-Me), 10.6 (12-Me). High-resolution FABMS: calc for C₂₅H₄₃O₈N (M+H)⁺ 484.2910, found 484.2903.

[0235] Spectral data of formula 8 are: ¹H NMR (CDCl₃) δ 6.76 (1H, dd, J=16.0, 5.5, H-9), 6.44 (1H, dd, J=16.0, 1.5, H-8), 5.50 (1H, d, J=6.5, N-H), 4.80 (1H, dd, J=9.0, 2.0, H-11), 4.28 (1H, d, J=7.5, H-1′), 3.95 (1H, m, H-3′), 3.88 (1H, m, H-12), 3.62 (1H, d, J=11.0, H-3), 3.57 (1H, m, H-5′), 3.18 (1H, dd, J=10.0, 7.5, H-2′), 3.06 (1H, m, H-10), 2.86 (1H, dq, J=11.0, 7.0, H-2), 2.54 (1H, m, H-6), 2.04 (3H, s, NAc), 1.98 (1H, m, H-4′), 1.67 (1H, m, H-5), 1.40 (1H, m, H-5), 1.39 (3H, d, J=7.0, 2-Me), 1.25 (1H, m, H-4), 1.22 (1H, m, H-4′), 1.22 (3H, d, J=6.0, H-6′), 1.21 (3H, d, J=6.0, 6-Me), 1.19 (3H, d, J=7.0, 12-Me), 1.16 (3H, d, J=6.5, 10-Me), 1.01 (3H, d, J=6.5, 4-Me); ¹³C NMR (CDCl₃) δ 205.1 (C-7), 174.6 (C-1), 171.9 (Me-C=O), 147.2 (C-9), 126.2 (C-8), 104.4 (C-1′), 85.3 (C-3), 75.7 (C-11), 75.4 (C-2′), 68.7 (C-5′), 66.4 (C-12), 52.0 (C-3′), 45.1 (C-6), 43.8 (C-2), 38.6 (C-4′), 35.4 (C-10), 34.1 (C-5), 33.4 (C-4), 23.1 (Me-C=O), 21.0 (12-Me), 20.7 (C-6′), 17.7 (6-Me), 17.4 (4-Me), 16.1 (2-Me), 9.8 (10-Me). High-resolution FABMS: calc for C₂₅H₄₃O₈N (M+H)⁺ 484.2910, found 484.2892.

[0236] The production of compounds of formula (7) and (8) bearing modified desosamine by the desVI-deletion mutant is a significant finding. However, this result is also somewhat surprising since the sugar component in the products is expected to be the aminodeoxy hexose (4). As illustrated in FIG. 27, it is possible that compounds of formula (7) and (8) are derived from the predicted compounds of formula (9) and (10), respectively, by a post-synthetic nonspecific acetylation of the attached aminodeoxy sugar. It is also conceivable that N-acetylation of (4) occurs first, followed by coupling of the resulting sugar (11) to the 10-deoxymethynolide (6). Nevertheless, the lack of N-methylation of the sugar component in these new products provides convincing evidence sustaining the assignment of desVI as the N-methyltransferase gene. Most significantly, the production of compounds of formula (7) and (8) by the desVI-deletion mutant attests to the fact that the glycosyltransferase (DesVII) in the methymycin/neomethymycin pathway is capable of recognizing and processing sugar substrates other than TDP-desosamine (5).

[0237] Since both compounds of formula (7) and (8) are new compounds synthesized in vivo by the S. venezuelae mutant strain, the observed N-acetylation might be a necessary step for self-protection (Cundliffe, 1989). In view of these results, the potential toxicity associated with new macrolide antibiotics produced by genetically engineered microorganisms can be minimized and newly formed antibiotics that have been deactivated (either deliberately or not) during production can be activated. Such an approach can be part of an overall strategy for the development of novel antibiotics using the combinatorial biosynthetic approach. Indeed, purified compounds of formula (7) and (8) are inactive against Streptococcus pyogenes grown on Mueller-Hinton agar plates (Mangahas, 1996), while the controls (a compound of formula (1) and (2)) show clearly visible inhibition zones.

[0238] It should be pointed out that a few glycosyltransferases involved in the biosynthesis of antibiotics have been shown to have relaxed specificity towards modified macrolactones (Jacobsen et al., 1997; Marsden et al., 1998; Weber et al., 1991). However, a similar relaxed specificity toward sugar substrates has only been reported for the daunorubicin glycosyltransferase, which is able to recognize a modified daunosamine and catalyze its coupling to the aglycone, ε-rhodomycinone (Madduri et al., 1998). Thus, the fact that the methymycin/neomethymycin glycosyltransferase can also tolerate structural variants of its sugar substrate indicates that at least some glycosyltransferases in antibiotic biosynthetic pathways may be useful to create biologically active hybrid natural products via genetic engineering.

Summary

[0239] The appended sugars in macrolide antibiotics are indispensable to the biological activities of these clinically important drugs. Therefore, the development of new antibiotics via a biological combinatorial approach requires detailed knowledge of the biosynthesis of these unusual sugars, as well as the ability to manipulate the biosynthetic genes to create novel sugars that can be incorporated into the final macrolide structures. A targeted deletion of the desVI gene of Streptomyces venezuelae, which has been predicted to encode an N-methyltransferase based on sequence comparison, was prepared to determine whether new methymycin/neomethymycin analogues bearing modified sugars can be generated by altering the desosamine biosynthetic genes. Growth of the S. venezuelae deletion mutant strain resulted in the accumulation of methymycin/neomethymycin analogues carrying an N-acetylated aminodeoxy sugar. Isolation and characterization of these derivatives not only provide the first direct evidence confirming the identity of desVI as the N-methyltransferase gene, but also demonstrate the feasibility of preparing novel sugars by the gene deletion approach. Most significantly, the results also revealed that the glycosyltransferase of methymycin/neomethymycin exhibits a relaxed specificity towards its sugar substrates.

Example 8 Cloning and Sequencing of the Met/Pik Biosynthetic Gene Cluster

Materials and Methods

[0240] Bacterial Strains and Media. E. coli DH5α was used as a cloning host. E. coli LE392 was the host for a cosmid library derived from S. venezuelae genomic DNA. LB medium was used in E. coli propagation. Streptomyces venezuelae ATCC 15439 was obtained as a freeze-dried pellet from ATCC. Media for vegetative growth and antibiotic production were used as described (Lambalot et al., 1992). Briefly, SGGP liquid medium was used for propagation of S. venezuelae mycelia. Sporulation agar (SPA) was used for production of S. venezuelae spores. Methymycin production was conducted in either SCM or vegetative medium, and pikromycin production was performed in Suzuki glucose-peptone medium.

[0241] Vectors, DNA Manipulation and Cosmid Library Construction. pUC119 was the routine cloning vector, and pNJ1 was the cosmid vector used for genomic DNA library construction. Plasmid vectors for gene disruption were either pGM160 (Muth et al., 1989) or pKC1139 (Bierman et al., 1992). Plasmid, cosmid, and genomic DNA preparation, restriction digestion, fragment isolation, and cloning were performed using standard procedures (Sambrook et al., 1989; Hopwood et al., 1985). The cosmid library was made according to instructions from the Packagene λ-packaging system (Promega).

[0242] DNA Sequencing and Analysis. An Exonuclease III (ExoIII) nested deletion series combined with PCR-based double stranded DNA sequencing was employed to sequence the pik cluster. The ExoIII procedure followed the Erase-a-Base protocol (Stratagene), and DNA sequencing reactions were performed using the Dye Primer Cycle Sequencing Ready Reaction Kit (Applied Biosystems). The nucleotide sequences were read from an ABI PRISM 377 sequencer on both DNA strands. DNA and deduced protein sequence analyses were performed using GeneWorks and the GCG sequence analysis package. All analyses were performed using the specific program default parameters.

[0243] Gene Disruption. A replicative plasmid-mediated homologous recombination approach was developed to conduct gene disruption in S. venezuelae. Plasmids for insertional inactivation were constructed by cloning a kanamycin resistance marker into target genes, and plasmids for gene deletion/replacement were constructed by replacing the target gene with a kanamycin or thiostrepton resistance gene in the plasmid. Disruption plasmids were introduced into S. venezuelae by either PEG-mediated protoplast transformation (Hopwood et al., 1985) or RK2-mediated conjugation (Bierman et al., 1992). Then, spores from individual transformants or transconjugants were cultured on non-selective plates to induce recombination. The cycle was repeated three times to enhance the opportunity for recombination. Double crossovers yielding targeted gene disruption mutants were selected and screened using the appropriate combination of antibiotics and finally confirmed by Southern hybridization.

[0244] Antibiotic Extraction and Analysis. Methymycin, pikromycin, and related compounds were extracted following published procedures (Cane et al., 1993). Thin layer chromatography (TLC) was routinely used to detect methymycin, neomethymycin, narbomycin and pikromycin. Further purification was conducted using flash column chromatography and HPLC, and the purified compounds were analyzed by ¹H and ¹³C NMR spectroscopy and mass spectrometry.

[0245] Results

[0246] Cloning and Identification of the pik Cluster. Heterologous hybridization was used to identify genes for methymycin, neomethymycin, narbomycin and pikromycin biosynthesis in S. venezuelae. Initial Southern blot hybridization analysis using a type I PKS DNA probe revealed two multifunctional PKS clusters of uncharacterized function in the genome. Since these four antibiotics all contain an identical desosamine residue, a tylAI α-D-glucose-1-phosphate thymidylyltransferase DNA probe (for mycaminose/mycarose/mycinose biosynthesis in the tylosin pathway) (Merson-Davies et al., 1994) was used to locate the corresponding biosynthetic gene cluster(s). This analysis established that only one of the PKS pathways contained a cluster of desosamine biosynthetic genes. Nine overlapping cosmid clones were isolated spanning over 80 kilobases (kb) on the bacterial chromosome that encompassed the entire gene cluster (pik) for methymycin, neomethymycin, narbomycin and pikromycin biosynthesis (FIG. 28). Through subsequent gene disruption, the other PKS cluster (vep, devoid of linked desosamine biosynthetic genes) was found to play no role in production of methymycin, neomethymycin, narbomycin or pikromycin.

[0247] Nucleotide Sequence of the pik Cluster. The nucleotide sequence of the pik cluster was completely determined and shown to contain 18 open reading frames (ORFs) that span approximately 60 kb. Central to the cluster are four large ORFs, pikAI, pikAII, pikAIII, and pikAIV, encoding a multifunctional PKS (FIG. 28). Analysis of the six modules comprising the pik PKS indicated that it would specify production of narbonolide, the 14-membered ring aglycone precursor of narbomycin and pikromycin (FIG. 28).

[0248] Initial analysis unveiled two significant architectural differences in the pikA-encoded PKS. First, compared with eryA (Donadio et al., 1998) and oleA (Swan et al., 1994), two PKS clusters that produce the 14-membered ring macrolides erythromycin and oleandomycin, which are similar to pikromycin, the presence of separate ORFs, pikAIII and pikAIV, encoding Pik module 5 and Pik module 6 as individual modules, as opposed to one bimodular protein as in eryAIII and oleAIII, is striking. Second, the presence of a type II thioesterase immediately downstream of the type I PKS cluster is also unprecedented (FIG. 28). These two characteristics suggest that pikA may produce the 12-membered ring macrolactone 10-deoxymethynolide as well. Indeed, the domain organization of PikAI-AIII (modules L-5) is consistent with the predicted biosynthesis of 10-deoxymethynolide except for the absence of a TE function at the C-terminus of Pik module 5 (PikAIII). The lack of a TE domain in PikAIII may be compensated by the type II TE (encoded by pikAV) immediately downstream of pikAIV. Consistent with the supposition that two distinct polyketide ring systems are assembled from the pik PKS, two macrolide-lincosamide-streptogramin B type resistance genes, pikR1 and pikR2, are found upstream of the pik PKS (FIG. 29), which presumably provide cellular self-protection for S. venezuelae.

[0249] The genetic locus for desosamine biosynthesis and glycosyl transfer is immediately downstream of pikA. Seven genes, desI, desII, desIII, desIV, desV, desVI, and desVIII, are responsible for the biosynthesis of the deoxysugar, and the eighth gene, desVII, encodes a glycosyltransferase that apparently catalyzes transfer of desosamine onto the alternate (12- and 14-membered ring) polyketide aglycones. The existence of only one set of desosamine genes indicates that DesVII can accept both 10-deoxymethynolide and narbonolide as substrates (Jacobsen et al., 1997). The largest ORF in the des locus, desR, encodes a β-glycosidase that is involved in a drug inactivation-reactivation cycle for bacterial self-protection.

[0250] Just downstream of the des locus are a gene (pikC) encoding a cytochrome P450 hydroxylase, PikC, similar to eryF (Andersen et al., 1992) and eryK (Stassi et al., 1993), and a gene (pikD) encoding a putative regulator protein, PikD (FIG. 28). Interestingly, PikC is the only P450 hydroxylase identified in the entire pik cluster, suggesting that the enzyme can accept both 12- and 14-membered ring macrolide substrates and, more remarkably, that it is active on both C-10 and C-12 of YC-17 (the 12-membered ring intermediate) to produce methymycin and neomethymycin (FIG. 30). PikD is a putative regulatory protein similar to ORFH in the rapamycin gene cluster (Schwecke et al., 1995).

[0251] The combined functionality coded by the eighteen genes in the pik cluster predicts biosynthesis of methymycin, neomethymycin, narbomycin and pikromycin (Table 2). Flanking the pik cluster locus are genes presumably involved in primary metabolism and genes that may be involved in both primary and secondary metabolism. An S-adenosyl-methionine synthase gene is located downstream of pikD that may help to provide the methyl group in desosamine synthesis. A threonine dehydratase gene was identified upstream of pikR1 that may provide precursors for polyketide biosynthesis. It is not apparent that any of these genes are dedicated to antibiotic biosynthesis and they are not directly linked to the pik cluster.

TABLE 2
Deduced function of ORFs in the pik cluster

Polypeptide   Amino
(ORF)         acids, no.   Proposed function or sequence similarity detected
PikAI         4,613        PKS
                             Loading module   KS^(Q) AT(P) ACP
                             Module 1         KS AT(P) KR ACP
                             Module 2         KS AT(A) DH KR ACP
PikAII        3,739        PKS
                             Module 3         KS AT(P) KR⁰ ACP
                             Module 4         KS AT(P) DH ER KR ACP
PikAIII       1,562        PKS
                             Module 5         KS AT(P) KR ACP
PikAIV        1,346        PKS
                             Module 6         KS AT(P) ACP TE
PikAV         281          Thioesterase II (TEII)
DesI          415          4-Dehydrase
DesII         485          Reductase?
DesIII        292          α-D-Glucose-1-phosphate thymidylyltransferase
DesIV         337          TDP-glucose 4,6-dehydratase
DesV          379          Transaminase
DesVI         237          N,N-dimethyltransferase
DesVII        426          Glycosyl transferase
DesVIII       402          Tautomerase?
DesR          809          β-Glucosidase (involved in resistance mechanism)
PikC          418          P450 hydroxylase
PikD          945?         Putative regulator
PikR1         336          rRNA methyltransferase (mls resistance)
PikR2         288?         rRNA methyltransferase (mls resistance)
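The domain compositions listed in Table 2 can be read using the usual rule for modular type I PKSs, in which the set of reductive domains in a module determines the oxidation state of the β-carbon it installs. The Python sketch below applies that rule to the six extension modules. It is an illustration only: the AT substrate annotations (P, A) are omitted, the inactive ketoreductase KR⁰ is written as "KR0", and the function name is hypothetical.

```python
# Domain composition of the six pik extension modules, transcribed from Table 2.
PIK_MODULES = {
    "Module 1": ["KS", "AT", "KR", "ACP"],
    "Module 2": ["KS", "AT", "DH", "KR", "ACP"],
    "Module 3": ["KS", "AT", "KR0", "ACP"],  # KR0 = inactive ketoreductase
    "Module 4": ["KS", "AT", "DH", "ER", "KR", "ACP"],
    "Module 5": ["KS", "AT", "KR", "ACP"],
    "Module 6": ["KS", "AT", "ACP", "TE"],
}


def beta_carbon_state(domains: list[str]) -> str:
    """Predict the beta-carbon state from the active reductive domains present."""
    present = set(domains)
    if "KR" not in present:
        return "keto"       # no active ketoreductase: beta-keto group retained
    if "DH" not in present:
        return "hydroxyl"   # KR only: keto reduced to hydroxyl
    if "ER" not in present:
        return "alkene"     # KR + DH: dehydration gives a double bond
    return "methylene"      # KR + DH + ER: fully reduced


for module, domains in PIK_MODULES.items():
    print(f"{module}: {beta_carbon_state(domains)}")
```

Because KR⁰ is encoded as a distinct token, module 3 is treated as lacking an active ketoreductase and is predicted to leave a keto group.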

[0252] TABLE 3
Summary of mutational analyses of the pik cluster

                                                Antibiotic production/Intermediate accumulation
Mutant   Type of mutation       Target gene   Met & neomethymycin          Pikromycin
AX903    Insertion              pikAI         No/No                        No/No
LZ3001   Deletion/replacement   desVI         No/10-deoxymethynolide       No/narbonolide
LZ4001   Deletion/replacement   desV          No/10-deoxymethynolide       No/narbonolide
AX905    Deletion/replacement   pikAV         <5%/No                       <5%/No
AX906    Insertion              pikC          No/YC-17                     No/narbomycin

[0253] Mutational Analysis of the pik Cluster. Extensive disruption of genes in the pik cluster was carried out to address the role of key enzymes in antibiotic production (Table 3). First, PikAI, the first putative enzyme involved in the biosynthesis of 10-deoxymethynolide and narbonolide, was inactivated by insertional mutagenesis. The resulting mutant, AX903, produced neither methymycin and neomethymycin nor narbomycin and pikromycin, indicating that pikA encodes a PKS required for both 12- and 14-membered ring macrolactone formation.

[0254] Second, deletion of either desVI or desV abolished methymycin, neomethymycin, narbomycin and pikromycin production, and the resulting mutants, LZ3001 and LZ4001, respectively, accumulated 10-deoxymethynolide and narbonolide in their culture broth, indicating that enzymes for desosamine synthesis and transfer are also shared by the 12- and 14-membered ring macrolides.

[0255] In order to understand the mechanism of polyketide chain termination at PikAIII (module 5), which is presumed to be the termination point in construction of 10-deoxymethynolide, the pik TEII gene, pikAV, was deleted. The deletion/replacement mutant, AX905, produces less than 5% of the methymycin and neomethymycin, and less than 5% of the pikromycin, produced by wild type S. venezuelae. This abrogation in product formation occurs without significant accumulation of the expected aglycone intermediates, suggesting that pik TEII is involved in the termination of 12- as well as 14-membered ring macrolides at PikAIII and PikAIV, respectively. Although polar effects could influence the observed phenotype in AX905, this possibility was ruled out by consideration of mutant LZ3001, in which a mutation in a gene downstream of pikAV led to accumulation of 10-deoxymethynolide and narbonolide. The fact that mutant AX905 failed to accumulate these intermediates suggests that the polyketide chains were not efficiently released from the PKS protein in the absence of Pik TEII. Therefore, Pik TEII plays a crucial role in polyketide chain release and cyclization, and it presumably provides the mechanism for alternative termination in pik polyketide biosynthesis.

[0256] Finally, disruption of pikC confirmed that PikC is the sole enzyme catalyzing hydroxylation of both YC-17 (at C-10 and C-12) and narbomycin (at C-12). The relaxed substrate specificity of PikC and its regiospecificity at C-10 and C-12 provide another layer of metabolite diversity in the pik-encoded biosynthetic system.
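The reasoning applied to the mutants of Table 3 can be summarized in a short sketch. The Python below is illustrative only. It transcribes the reported phenotypes and applies the rule used in the text: a mutant that accumulates an intermediate is blocked after formation of that intermediate, whereas a mutant that accumulates nothing is blocked at or before macrolactone release. The names TABLE_3, affects_both_families and block_position are hypothetical.

```python
# Phenotypes transcribed from Table 3. For each product family the tuple is
# (antibiotic production, accumulated intermediate or None).
TABLE_3 = {
    "AX903": {"gene": "pikAI",
              "12-membered": ("No", None),
              "14-membered": ("No", None)},
    "LZ3001": {"gene": "desVI",
               "12-membered": ("No", "10-deoxymethynolide"),
               "14-membered": ("No", "narbonolide")},
    "LZ4001": {"gene": "desV",
               "12-membered": ("No", "10-deoxymethynolide"),
               "14-membered": ("No", "narbonolide")},
    "AX905": {"gene": "pikAV",
              "12-membered": ("<5%", None),
              "14-membered": ("<5%", None)},
    "AX906": {"gene": "pikC",
              "12-membered": ("No", "YC-17"),
              "14-membered": ("No", "narbomycin")},
}
RINGS = ("12-membered", "14-membered")


def affects_both_families(record):
    """True if production is lost or strongly reduced in both families."""
    return all(record[ring][0] in ("No", "<5%") for ring in RINGS)


def block_position(record):
    """Classify the pathway block from the accumulated intermediates."""
    accumulated = [record[ring][1] for ring in RINGS if record[ring][1]]
    if not accumulated:
        return "at or before macrolactone release"
    return "after formation of " + " and ".join(accumulated)


for mutant, record in TABLE_3.items():
    print(mutant, record["gene"],
          affects_both_families(record), block_position(record))
```

Every mutant in the table affects both product families, which is the basis for the statement that the PKS, the desosamine enzymes, Pik TEII and PikC are shared by the 12- and 14-membered ring pathways.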

[0257] Discussion

[0258] The work described herein has established that methymycin, neomethymycin, narbomycin and pikromycin biosynthesis is encoded by the pik cluster in S. venezuelae. Three key enzymes as well as the unique architecture of the cluster enable this relatively compact system to produce multiple macrolide antibiotics. Foremost, the presence of pik modules 5 and 6 as separate proteins, PikAIII and PikAIV, and the activity of pik TEII enable the bacterium to terminate the polyketide chain at two different points of assembly, thereby producing two macrolactones of different ring size. Second, DesVII, the glycosyltransferase in the pik cluster, can accept both 12- and 14-membered ring macrolactones as substrates. Finally, PikC, the P450 hydroxylase, has a remarkable substrate and regiochemical specificity that introduces another layer of diversity into the system.

[0259] It is interesting to consider that pikA evolved in a line analogous to eryA and oleA, since each of these PKSs specifies the synthesis of 14-membered ring macrolactones. Therefore, pik may have acquired the capacity to generate methymycin when a mutation in the primordial pikAIII-pikAIV linker region caused splitting of Pik modules 5 and 6 into two separate gene products. This notion is raised by two features of the nucleotide sequence. First, the intergenic region between pikAIII and pikAIV, which is 105 bp, may be the remnant of an intramodular linker peptide of 35 amino acids. Moreover, the potential for independently regulated expression of pikAIV is implied by the presence of a 100 nucleotide region at the 5′ end of the gene that is relatively AT-rich (62% as compared with 74% G+C content in the coding region). Thus, as the mutation in an original ORF encoding the bimodular multifunctional protein (PikAIII-PikAIV) occurred, so too may have evolved a mechanism for regulated synthesis of the new gene product (PikAIV).
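Both sequence features cited in this paragraph rest on simple arithmetic that can be checked directly. The Python sketch below is illustrative only: it confirms that a 105 bp region corresponds to 35 codons and shows one way a G+C fraction could be computed over a sliding window to locate a relatively AT-rich stretch. The function names are hypothetical, and the example strings are toy inputs, not sequence from the pik cluster.

```python
def codons_in(length_bp):
    """Number of whole codons contained in length_bp nucleotides."""
    return length_bp // 3


def gc_fraction(seq):
    """Fraction of G and C bases in a nucleotide string (case-insensitive)."""
    if not seq:
        raise ValueError("empty sequence")
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq, window=100, step=10):
    """Yield (start, G+C fraction) for successive windows along seq."""
    for start in range(0, len(seq) - window + 1, step):
        yield start, gc_fraction(seq[start:start + window])


print(codons_in(105))        # 35 codons, i.e. a 35-residue linker
print(gc_fraction("ATGC"))   # 0.5 for this toy input
```

Applied to a real sequence, windowed_gc would report one value per window, so a region whose G+C fraction falls well below that of the flanking coding sequence stands out.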

[0260] The role of Pik TEII in alternative termination of polyketide chain elongation intermediates provides a unique aspect of diversity generation in natural product biosynthesis. Engineered polyketides of different chain length are typically generated by moving the TE catalytic domain to alternate positions in a modular PKS (Cortes et al., 1995). Repositioning of the TE domain necessarily abolishes production of the original full-length polyketide, so only one macrolide is produced each time. In contrast to the fixed-position TE domain, the independent Pik TEII polypeptide presumably has the flexibility to catalyze termination at different stages of polyketide assembly, thereby enabling the system to produce multiple products of variant chain length. Combinatorial biology technologies can now exploit this system for generating molecular diversity through construction of novel PKS systems with TEIIs for simultaneous production of several new molecules, as opposed to the TE domains alone, which limit catalysis to a single termination step.

[0261] It is noteworthy that sequences similar to Pik TEII are found in almost all known polyketide and non-ribosomal polypeptide biosynthetic systems (Marahiel et al., 1997). Currently, the pik TEII is the first to be characterized in a modular PKS. However, recent work on a TEII gene in the lipopeptide surfactin biosynthetic cluster (Schneider et al., 1998) demonstrated that srf-TEII plays an important role in polypeptide chain release, and may suggest that srf-TEII reacts at multiple stages in peptide assembly as well (Marahiel et al., 1997).

[0262] The enzymes involved in post-polyketide assembly of 10-deoxymethynolide and narbonolide are particularly intriguing, especially the glycosyltransferase, DesVII, and P450 hydroxylase, PikC. Both have the remarkable ability to accept substrates with significant structural variability. Moreover, disruption of desVI demonstrated that DesVII also tolerates variations in deoxysugar structure (Example 6). Likewise, PikC has recently been shown to convert YC-17 to methymycin/neomethymycin and narbomycin to pikromycin in vitro.

[0263] Targeted gene disruption of ORF1 abolished both pikromycin and methymycin production, indicating that the single cluster is responsible for biosynthesis of both antibiotics. Deletion of the TE2 gene substantially reduced methymycin and pikromycin production, which demonstrates that TE2, in contrast to the position-fixed TE1 domain, has the capacity to release the polyketide chain at different points during the assembly process, thereby producing polyketides of different chain length.

[0264] The results described above were unexpected in several respects: that one PKS cluster produces two macrolides which differ in the number of atoms in their ring structure; that module 5 and module 6 of the PKS are in ORFs that are separated by a spacer region; that PikAIII lacks a TE; that there is a Type II thioesterase; that the TEI domain is not separate; and that two resistance genes were identified which may be specific for either a 12- or 14-membered ring.

[0265] With eighteen genes spanning less than 60 kb of DNA capable of producing four active macrolide antibiotics, the pik cluster represents the least complex yet most versatile modular PKS system so far investigated. This simplicity provides the basis for a compelling expression system in which novel active ketoside products are engineered and produced with considerable facility for discovery of a diverse range of new biologically active compounds.

[0266] Summary

[0267] Complex polyketide synthesis follows a processive reaction mechanism, and each module within a PKS harbors a string of three to six enzymatic domains that catalyze reactions in nearly linear order, as described in particular detail for the erythromycin-producing PKS (Katz, 1997; Khosla, 1997; Staunton et al., 1997). The combined set of PKS modules and catalytic domains, along with genes that encode enzymes for post-polyketide tailoring (e.g., glycosyl transferases, hydroxylases), typically limits a biosynthetic system to the generation of a single polyketide product.

[0268] Combinatorial biology involves the genetic manipulation of multistep biosynthetic pathways to create molecular diversity in natural products for use in novel drug discovery. PKSs represent one of the most amenable systems for combinatorial technologies because of their inherent genetic organization and ability to produce polyketide metabolites, a large group of natural products generated by bacteria (primarily actinomycetes and myxobacteria) and fungi with diverse structures and biological activities. Complex polyketides are produced by multifunctional PKSs involving a mechanism similar to long-chain fatty acid synthesis in animals (Hopwood et al., 1990). Pioneering studies (Cortes et al., 1990; Donadio et al., 1991) on the erythromycin PKS in Saccharopolyspora erythraea revealed a modular organization. Characterization of this multidomain protein system, followed by molecular analysis of rapamycin (Aparicio et al., 1996), FK506 (Motamedi et al., 1997), soraphen A (Schupp et al., 1995), niddamycin (Kakavas et al., 1997), and rifamycin (August et al., 1998) PKSs, demonstrated a co-linear relationship between modular structure of a multifunctional bacterial PKS and the structure of its polyketide product.

[0269] In a survey of microbial systems capable of generating unusual metabolite structural variability, Streptomyces venezuelae ATCC 15439 is notable in its ability to produce two distinct groups of macrolide antibiotics. Methymycin and neomethymycin are derived from the 12-membered ring macrolactone 10-deoxymethynolide, while narbomycin and pikromycin are derived from the 14-membered ring macrolactone, narbonolide. The cloning and characterization of the biosynthetic gene cluster for these antibiotics reveals the key role of a type II thioesterase in forming a metabolic branch through which polyketides of different chain length are generated by the pikromycin multifunctional polyketide synthase (PKS). Immediately downstream of the PKS genes (pikA) is a set of genes for desosamine (des) biosynthesis and macrolide ring hydroxylation. The glycosyl transferase (encoded by desVII) has the remarkable ability to catalyze glycosylation of both the 12- and 14-membered ring macrolactones. Moreover, the pikC-encoded P450 hydroxylase provides yet another layer of structural variability by introducing regiochemical diversity into the macrolide ring systems.

Example 9 Strategies Employing Modular PKS as PHA Monomer Providers

[0270] One strategy to exploit modular PKSs, e.g., modules of pikA or a FAS, to provide PHA monomers is to harvest polyketide intermediates as CoA derivatives using a TEII which is converted to an acyl-CoA transferase (mTEII). PikTEII is a small enzyme (281 amino acids) encoded by pikAV in S. venezuelae. The primary function of the wild-type enzyme is to catalyze the release of a polyketide chain at the fifth module in the pikA pathway as 10-deoxymethynolide. The enzyme most likely binds to the fifth module (PikAIII) ACP (ACP5) and releases the acyl chain attached to it. This relationship between TEII and its cognate ACP5 can be exploited to produce polyketides having different chain lengths by moving Pik ACP5 to a different position in the cluster. For example, by moving ACP5 into the second module in place of ACP2, a triketide instead of a hexaketide may be produced by the cluster. Further, by moving KR5 together with ACP5 into the second module, replacing the DH, KR, and ACP domains, a 3-hydroxyl triketide is produced that is structurally suitable as a PHA monomer. A mutant TEII (mTEII) catalyzes the release of the triketide in its CoA form. The triketide-CoA, 3,5-dihydroxyl-4-methyl-heptonyl-CoA, is a substrate for a PHA polymerase, e.g., PhaC1 from P. oleovorans, which, in turn, can incorporate the monomer into a polymer.
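The relationship between the release point and the chain length, and the domain replacement proposed for module 2, can be sketched as follows. The Python is illustrative only. It assumes that each chain carries one starter unit from the loading module plus one unit per extension module traversed before release, and it treats domains as simple labels. The names ketide_units, swap_domains and KETIDE_NAMES are hypothetical.

```python
KETIDE_NAMES = {3: "triketide", 6: "hexaketide", 7: "heptaketide"}


def ketide_units(release_module):
    """Ketide units in a chain released after the given extension module.

    Assumes one starter unit plus one unit per extension module.
    """
    if release_module < 1:
        raise ValueError("release_module must be 1 or greater")
    return release_module + 1


def swap_domains(module, remove, insert):
    """Return a copy of module without the removed domains, plus the inserts."""
    kept = [domain for domain in module if domain not in remove]
    return kept + list(insert)


# Module 2 as listed in Table 2; DH, KR and ACP are replaced by KR5 and ACP5.
module2 = ["KS", "AT", "DH", "KR", "ACP"]
engineered = swap_domains(module2, remove={"DH", "KR", "ACP"},
                          insert=["KR5", "ACP5"])

print(engineered)                      # ['KS', 'AT', 'KR5', 'ACP5']
print(KETIDE_NAMES[ketide_units(2)])   # triketide (release at module 2)
print(KETIDE_NAMES[ketide_units(5)])   # hexaketide (release at module 5)
```

In this simplified picture, placing the TEII-recognized ACP5 in module 2 moves the release point from module 5 to module 2, and retaining only a KR as the reductive domain leaves the 3-hydroxyl group required of a PHA monomer.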

[0271] A second strategy includes the harvesting of a polyketide intermediate as a CoA derivative using a TEI which has been converted to an acyl-CoA transferase (mTE). Thus, the second strategy for 3-hydroxyacyl-CoA monomer production is to exploit the TE domain (TEI) within the PKS module. It has been demonstrated that the TE domain can release polyketide intermediates attached to the ACP domain within the same module. Moving the TEI to a different position in a PKS cluster results in the production of a polyketide having a different chain length. Similarly, a mutant TEI (mTEI) (i.e., one which is an acyl-CoA transferase) releases the polyketide intermediate as an acyl-CoA, which then is polymerized by PHA synthetase. Preferably, a mutant TE domain in the pikA gene cluster is moved into pik module 1, fusing it immediately downstream of ACP1. The recombinant enzyme produces 2-(S)-methyl-3(R)-hydroxylvaleryl-CoA, which is a suitable substrate for PHA polymerase PhaC1. Therefore, the coexpression of the polymerase with the recombinant PKS produces a polymer.

[0272] A third strategy is to directly collect polyketide intermediates as substrates for PHA synthesis by fusing a PHA polymerase with a polyketide synthase. The first two strategies produce 3-hydroxylacyl-CoA as a substrate for PHA synthesis by employing a mutant PKS enzyme (TEI or TEII). As PHA polymerase may be active on acyl-ACP itself if the acyl-ACP is properly oriented, the third strategy fuses a PHA polymerase downstream of an ACP in a PKS protein. The PHA synthetase then serves as a domain within the chimeric multifunctional enzyme in place of a TE domain. The PKS portion of the protein catalyzes the synthesis of a 3-hydroxylacyl-ACP intermediate, and then the PHA synthetase domain accepts it as substrate and adds the 3-hydroxylacyl monomer to the growing polyhydroxyalkanoate chain. The process regenerates ACP function so that the reaction can go on repeatedly to synthesize a PHA of multiple units. For example, a phaC1 gene is fused directly downstream of pik ACP1 so as to produce a chimeric enzyme that catalyzes the synthesis of a polymer.

[0273] The strategies described above can produce PHAs of complex structure having superior properties. In addition, the structure can be easily fine-tuned by modifying the PKS gene, thus resulting in PHAs having desired properties or functions.

REFERENCES

[0274] Andersen, J. R., Hutchinson, C. R. J. Bacteriol., 174:725-735 (1992).

[0275] Aparicio, J. F., Molnar, I., Schwecke, T., Konig, A., Haydock, S. F., Khaw, L. E., Staunton, J., Leadlay, P. F. Gene, 16:9-16 (1996).

[0276] Arisawa, A., Kawamura, N., Takeda, K., Tsunekawa, H., Okamura, K., Okamoto, R. Appl. Environ. Microbiol., 60:2657-2660 (1994).

[0277] August, P. R., Tang, L., Yoon, Y. J., Ning, S., Muller, R., Yu, T. W., Taylor, M., Hoffmann, D., Kim, C. G., Zhang, X., Hutchinson, C. R. & Floss, H. G. Chem. Biol., 5:69-79 (1998).

[0278] Baltz, R. H., Seno, E. T. Annu. Rev. Microbiol., 42:547-574 (1988).

[0279] Bibb, M. J., Bibb, M. J., Ward, J. M., Cohen, S. N. Mol. Gen. Genet., 199:26-36 (1985).

[0280] Bierman, M., Logan, R., O'Brien, K., Seno, G., Nagaraja, R., Schoner, B. E. Gene, 116:43-49 (1992).

[0281] Box, R. P. Clin. Infect. Dis., 24:S151 (1997).

[0282] Cane, D. E., Lambalot, R. H., Prabhakaran, P. C., Ott, W. R. J. Am. Chem. Soc., 115:522-526 (1993).

[0283] Carreras, C. W., Pieper, R., Khosla, C. In Bioorganic Chemistry Deoxysugars, Polyketides & Related Classes: Synthesis, Biosynthesis, Enzymes, Rohr, J. (ed.), Springer:Berlin, 85-126 (1997).

[0284] Castle, L. A., Smith, K. D., Morris, R. O. J. Bacteriol., 174:1478-1486 (1992).

[0285] Celmer, W. D., Nagel, A. A., Wadlow, J. W., Tatematsu, H., Ikenaga, S., Nakanishi, S. Abstracts of Papers of 24th Intersci. Conf. on Antimicrob. Agents Chemother., No. 1142, Washington, D. C. (1985).

[0286] Cortes, J., Haydock, S. F., Roberts, G. A., Bevitt, D. J., Leadlay, P. F. Nature, 348:176-8 (1990).

[0287] Cortes, J., Wiesmann, K. E., Roberts, G. A., Brown, M. J., Staunton, J., Leadlay, P. F. Science, 268:1487-9 (1995).

[0288] Cundliffe, E. C. Annu. Rev. Microbiol., 43:207-233 (1989).

[0289] Cundliffe, E. Antimicrob. Agents Chemother., 36:348-352 (1992).

[0290] Davies, J. Nature, 383:219-220 (1996).

[0291] Djerassi, C., Zderic, J. A. J. Am. Chem. Soc., 78:6390-6395 (1956).

[0292] Donadio, S., McAlpine, J. B., Sheldon, P. J., Jackson, M., Katz, L. Proc. Natl. Acad. Sci. U.S.A., 90:7119-7123 (1993).

[0293] Donadio, S., Staver, M. J., McAlpine, J. B., Swanson, S. J., Katz, L. Science, 252:675-9 (1991).

[0294] Donadio, S., Katz, L. Gene, 111:51-60 (1992).

[0295] Donin, M. N., Pagano, J., Dutcher, J. D., McKee, C. M. Antibiotics Annu., 1:179-185 (1953-1954).

[0296] Epp, J., Huber, M. L. B., Turner, J. R., Goodson, T., Schoner, B. E. Gene, 85:293-301 (1989).

[0297] Flinn, E. H., Sigal, M. V., Jr., Wiley, P. F., Gerzon, K. J. Am. Chem. Soc., 76:3121-3131 (1954).

[0298] Gaisser, S., Bohm, G. A., Cortés, J., Leadlay, P. F. Mol. Gen. Genet., 256:239-251 (1997).

[0299] Gandecha, A. R., Large, S. L., Cundliffe, E. Gene, 184:197-203 (1997).

[0300] Geistlich, M., Losick, R., Turner, J. R., Rao, R. N. Mol. Microbiol., 6:2019-2029 (1992).

[0301] Haydock, S. F., Dowson, J. A., Dhillon, N., Roberts, G. A., Cortés, J., Leadlay, P. F. Mol. Gen. Genet., 230:120-128 (1991).

[0302] Hernandez, C., Olano, C., Mendez, C., Salas, J. A. Gene, 134:139-140 (1993).

[0303] Hopwood, D. A., Sherman, D. H. Annu. Rev. Genet., 24:37-66 (1990).

[0304] Hopwood, D. A., Malpartida, F., Kieser, H. M., Ikeda, H., Duncan, J., Fujii, I., Rudd, B. A., Floss, H. G., Omura, S. Nature, 314:642-644 (1985).

[0305] Hopwood, D. A., Bibb, M. J., Chater, K. J., Kieser, T., Bruton, C. J., Kieser, H. M., Lydiate, D. J., Smith, C. P., Ward, J. M., Schrempf, H., Genetic Manipulation of Streptomyces: A Laboratory Manual (The John Innes Foundation) (1985).

[0306] Hori et al., Chem. Comm., 304 (1971).

[0307] Hutchinson, C. R., Fujii, I. Annu. Rev. Microbiol., 49:201-238 (1995).

[0308] Ingrosso, D., Fowler, A. V., Bleibaum, J., Clarke, S. J. Biol. Chem., 264:20130-20139 (1989).

[0309] Jacobsen, J. R., Hutchinson, C. R., Cane, D. E., Khosla, C. Science, 277:367-369 (1997).

[0310] Jenkins, G., Cundliffe, E. Gene, 108:55-62 (1991).

[0311] Kakavas, S. J., Katz, L., Stassi, D. J. Bacteriol., 179:7515-22 (1997).

[0312] Katz, L., Donadio, S. Annu. Rev. Microbiol., 47:875-912 (1993).

[0313] Katz, L., Chem. Rev., 97:2557-2575 (1997).

[0314] Khosla, C., Chem. Rev., 97:2577-2590 (1997).

[0315] Khosla, C., Zawada, R. J. Trends Biotechnol., 14:335-341 (1996).

[0316] Kirschning, A., Bechthold, A. F.-W., Rohr, J. In Bioorganic Chemistry Deoxysugars, Polyketides & Related Classes: Synthesis, Biosynthesis, Enzymes, Rohr, J. (ed.), Springer:Berlin, 1-84 (1997).

[0317] Kramer, P. J., Khosla, C. Annu. N.Y. Acad. Sci., 799:32-45 (1996).

[0318] Kuo, M.-S., Chirby, D. G., Argoudelis, A. D., Cialdella, J. I., Coats, J. H., Marshall, V. P. Antimicrob. Agents Chemother., 33:2089-2091 (1989).

[0319] Lambalot, R. H., Cane, D. E. J. Antibiot., 45:1981-1982 (1992).

[0320] Lin, E. C. C., Goldstein, R., Syvanen, M. Bacteria, Plasmids, and Phages, An Introduction to Molecular Biology, Harvard University Press:Cambridge, p. 123 (1984).

[0321] Liu, H.-w., Thorson, J. S. Annu. Rev. Microbiol., 48:223-256 (1994).

[0322] Madduri, K., Kennedy, J., Rivola, G., Inventi-Solari, A., Filippini, S., Zanuso, G., Colombo, A. L., Gewain, K. M., Occi, J. L., MacNeil, D. J., Hutchinson, C. R. Nature Biotech., 16:69-74 (1998).

[0323] Mangahas, F. R. MS Thesis, University of Minnesota, 1996.

[0324] Marahiel, M. A., Stachelhaus, T., Mootz, H. D., Chem. Rev., 97:2651-2673 (1997).

[0325] Marsden, A. F. A., Wilkinson, B., Cortés, J., Dunster, N. J., Staunton, J., Leadlay, P. F. Science, 279:199-201 (1998).

[0326] Merson-Davies, L. A., Cundliffe, E. Mol. Microbiol., 13:349-355 (1994).

[0327] Merson-Davies, L. A., Cundliffe, E. Mol. Microbiol., 13:347-355 (1994).

[0328] Motamedi, H., Cai, S. J., Shafiee, A., Elliston, K. O. Eur. J. Biochem., 244:74-80 (1997).

[0329] Muth, G., Nubhaumer, B., Wohlleben, W., Puhler, A. Mol. Gen. Genet., 219:341-348 (1989).

[0330] Niemi, J., Mantsala, P. J. Bacteriol., 177:2942-2945 (1995).

[0331] Omura, S. (ed.) Macrolide Antibiotics, Chemistry Biology and Practice, Academic Press:New York (1984).

[0332] Omuras et al., J. Antibio., 29, 316 (1971).

[0333] Sambrook, J., Fritsch, E. F., Maniatis, T. Molecular Cloning: A Laboratory Manual (Cold Spring Harbor Laboratory Press), 2nd edition (1989).

[0334] Sasaki, J., Mizoue, K., Morimoto, S., Omura, S. J. Antibiotics, 49:1110-1118 (1996).

[0335] Schneider, A., Marahiel, M. A., Arch. Microbiol., 169:404-410 (1998).

[0336] Schupp, T., Toupet, C., Cluzel, B., Neff, S., Hill, S., Beck, J. J., Ligon, J. M., J. Bacteriol., 177:3673-9 (1995).

[0337] Schwecke, T., Aparicio, J. F., Molnar, I., Konig, A., Khaw, L. E., Haydock, S. F., Oliynyk, M., Caffrey, P., Cortes, J., Lester, J. B., et al. Proc. Natl. Acad. Sci. U.S.A., 92:7839-7843 (1995).

[0338] Seo, S., Tomita, Y., Tori, K., Yoshimura, Y. J. Am. Chem. Soc., 100:3331-3339 (1978).

[0339] Service, R. F. Science, 270:724-727 (1995).

[0340] Stassi, D., Donadio, S., Staver, M. J., Katz, L. J. Bacteriol., 175:182-189 (1993).

[0341] Staunton, J., Wilkinson, B., Chem. Rev., 97:2611-2629 (1997).

[0342] Summers, R. G., Donadio, S., Staver, M. J., Wendt-Pienkowski, E., Hutchinson, C. R., Katz, L. Microbiology, 143:3251-3262 (1997).

[0343] Swan, D. G., Rodriguez, A. M., Vilches, C., Mendez, C., Salas, J. A. Mol. Gen. Genet., 242:358-362 (1994).

[0344] Tuan, J. S., Weber, J. M., Staver, M. J., Leung, J. O., Donadio, S., Katz, L. Gene, 90:21-29 (1990).

[0345] Vilches, C., Hernandez, C., Mendez, C., Salas, J. A. J. Bacteriol., 174:161-165 (1992).

[0346] von Heijne, G. Nucleic Acids Res., 14:4683-4690 (1986).

[0347] von Heijne, G., Abrahmsen, L. FEBS Lett., 244:439-446 (1989).

[0348] Weber, J. M., Leung, J. O., Swanson, S. J., Idler, K. B., McAlpine, J. B. Science, 252:114-117 (1991).

[0349] The complete disclosures of all patents, patent documents and publications cited herein are incorporated herein by reference as if individually incorporated. The foregoing detailed description and examples have been given for clarity of understanding only. No unnecessary limitations are to be understood therefrom. The invention is not limited to the exact details shown and described, for variations obvious to one skilled in the art will be included within the invention defined by the claims.

1 43 1 15872 DNA Streptomyces venezuelae 1 ttaattaagg aggaccatcatgaacgaggc catcgccgtc gtcggcatgt cctgccgcct 60 gccgaaggcc tcgaacccggccgccttctg ggagctgctg cggaacgggg agagcgccgt 120 caccgacgtg ccctccggccggtggacgtc ggtgctcggg ggagcggacg ccgaggagcc 180 ggcggagtcc ggtgtccgccggggcggctt cctcgactcc ctcgacctct tcgacgcggc 240 cttcttcgga atctcgccccgtgaggccgc cgccatggac ccgcagcagc gactggtcct 300 cgaactcgcc tgggaggcgctggaggacgc cggaatcgtc cccggcaccc tcgccggaag 360 ccgcaccgcc gtcttcgtcggcaccctgcg ggacgactac acgagcctcc tctaccagca 420 cggcgagcag gccatcacccagcacaccat ggcgggcgtg aaccggggcg tcatcgccaa 480 ccgcgtctcg taccacctcggcctgcaggg cccgagcctc accgtcgacg ccgcgcagtc 540 gtcctcgctc gtcgccgtgcacctggcctg cgagtccctg cgcgccgggg agtccacgac 600 ggcgctcgtc gccggcgtgaacctcaacat cctcgcggag agcgccgtga cggaggagcg 660 cttcggtgga ctctccccggacggcaccgc ctacaccttc gacgcgcggg ccaacggatt 720 cgtccggggc gagggcggcggagtcgtcgt actcaagccg ctctcccgcg ccctcgccga 780 cggcgaccgt gtccacggcgtcatccgcgc cagcgccgtc aacaacgacg gagccacccc 840 gggtctcacc gtgcccagcagggccgccca ggagaaggtg ctgcgcgagg cgtaccggaa 900 ggcggccctg gacccgtccgccgtccagta cgtcgaactc cacggcaccg gaacccccgt 960 cggcgacccc atcgaggccgccgcgctcgg cgccgtcctc ggctcggcgc gccccgcgga 1020 cgaacccctg ctcgtcggctcggccaagac gaacgtcggg cacctcgaag gcgccgccgg 1080 catcgtcggc ctcatcaagacgctcctcgc gctcggccgg cgccggatcc cggcgagcct 1140 caacttccgt acgccccacccggacatccc gctcgacacc ctcgggctcg acgtgcccga 1200 cggcctgcgg gagtggccgcacccggaccg cgaactcctc gccggcgtca gctcgttcgg 1260 catgggcggc accaacgcccacgtcgtcct cagcgaaggc cccgcccagg gcggcgagca 1320 gcccggcatc gatgaggagacccccgtcga cagcggggcc gcactgccct tcgtcgtcac 1380 cggccgcggc ggcgaggccctgcgcgccca ggcccggcgc ctgcacgagg ccgtcgaagc 1440 ggacccggag ctcgcgcccgccgcactcgc ccggtcgctg gtcaccaccc gtacggtctt 1500 cacgcaccgg tcggtcgtcctcgccccgga ccgcgcccgc ctcctcgacg gcctcggcgc 1560 cctcgccgcc gggacgcccgcgcccggcgt ggtcaccggc acccccgccc ccgggcgcct 1620 cgccgtcctg ttcagcggccagggtgccca acgtacgggc atgggcatgg agttgtacgc 1680 cgcccacccc 
gccttcgcgacggccttcga cgccgtcgcc gccgaactgg accccctcct 1740 cgaccggccc ctcgccgaactcgtcgcggc gggcgacacc ctcgaccgca ccgtccacac 1800 acagcccgcg ctcttcgccgtggaggtcgc cctccaccgc ctcgtcgagt cctggggcgt 1860 cacgcccgac ctgctcgccggccactccgt cggcgagatc agcgccgccc acgtcgccgg 1920 ggtcctgtcg ctgcgcgacgccgcccgcct cgtcgcggcg cgcggccgcc tcatgcaggc 1980 gctccccgag ggcggcgcgatggtcgcggt cgaggcgagc gaggaggaag tgcttccgca 2040 cctcgcggga cgcgagcgggagctctccct cgcggccgtg aacggccccc gcgcggtcgt 2100 cctcgcgggc gccgagcgcgccgtcctcga cgtcgccgag ctgctgcgcg aacagggccg 2160 ccggacgaag cggctcagcgtctcgcacgc cttccactcg ccgctcatgg agccgatgct 2220 cgacgacttc cgccgggtcgtcgaagagct ggacttccag gagccccgcg tcgacgtcgt 2280 gtccacggtg acgggcctgcctgtcacagc gggccaatgg accgatcccg agtactgggt 2340 ggaccaggtc cgcaggcccgtacgcttcct cgacgccgta cgcaccctgg aggaatcggg 2400 cgccgacacc ttcctggagctcggtcccga cggggtctgc tccgcgatgg cggcggactc 2460 cgtacgcgac caggaggccgccacggcggt ctccgccctg cgcaagggcc gcccggagcc 2520 ccagtcgctg ctcgccgcactcaccaccgt cttcgtccgg ggccacgacg tcgactggac 2580 cgccgcgcac gggagcaccggcacggtcag ggtgcccctg ccgacctacg ccttccagcg 2640 cgaacgccac tggttcgacggcgccgcgcg aacggcggcg ccgctcacgg cgggccgatc 2700 gggcaccggt gcgggcaccggcccggccgc gggtgtgacg tcgggcgagg gcgagggcga 2760 gggcgagggc gcgggtgcgggtggcggtga tcggccggct cgccacgaga cgaccgagcg 2820 cgtgcgcgca cacgtcgccgccgtcctcga gtacgacgac ccgacccgcg tcgaactcgg 2880 cctcaccttc aaggagctgggcttcgactc cctcatgtcc gtcgagctgc ggaacgcgct 2940 cgtcgacgac acgggactgcgcctgcccag cggactgctc ttcgaccacc cgacgccgcg 3000 cgccctcgcc gcccacctgggcgacctgct caccggcggc agcggcgaga ccggatcggc 3060 cgacgggata ccgcccgcgaccccggcgga caccaccgcc gagcccatcg cgatcatcgg 3120 catggcctgc cgctaccccggcggcgtcac ctcccccgag gacctgtggc ggctcgtcgc 3180 cgaggggcgc gacgccgtctcggggctgcc caccgaccgc ggctgggacg aggacctctt 3240 cgacgccgac cccgaccgcagcggcaagag ctcggtccgc gagggcggat tcctgcacga 3300 cgccgccctg ttcgacgccggcttcttcgg gatatcgccc cgcgaggccc tcggcatgga 3360 cccgcagcag cggctgctcctggagacggc atgggaggcc 
gtggagcgcg cagggctcga 3420 ccccgaaggc ctcaagggcagccggacggc cgtcttcgtc ggcgccaccg ccctggacta 3480 cggcccgcgc atgcacgacggcgccgaggg cgtcgagggc cacctcctga ccgggaccac 3540 gcccagcgtg atgtcgggccgcatcgccta ccagctcggc ctcaccggtc ctgcggtcac 3600 cgtcgacacg gcctgctcgtcctcgctcgt cgcgctgcac ctggccgtcc gttcgctgcg 3660 gcagggcgag tcgagcctcgcgctcgccgg cggagcgacc gtcatgtcga caccgggcat 3720 gttcgtcgag ttctcgcggcagcgcggcct cgccgccgac ggccgctcca aggccttctc 3780 cgactccgcc gacggcacctcctgggccga gggcgtcggc ctcctcgtcg tcgagcggct 3840 ctcggacgcc gagcgcaacggccaccccgt gctcgccgtg atccggggca gcgcggtcaa 3900 ccaggacggc gcctccaacgggctcaccgc ccccaacggc ccgtcccagc agcgcgtcat 3960 ccgacaggcc ctggccgacgccgggctcac cccggccgac gtcgacgccg tcgaggcgca 4020 cggtacgggt acccggctcggcgaccccat cgaggccgag gcgatcctcg gcacctacgg 4080 ccgggaccgg ggcgagggcgctccgctcca gctcggctcg ctgaagtcga acatcggcca 4140 cgcgcaggcc gccgcgggcgtgggcgggct catcaagatg gtcctcgcga tgcgccacgg 4200 cgtcctgccc aggacgctccacgtggaccg gcccaccacc cgcgtcgact gggaggccgg 4260 cggcgtcgag ctcctcaccgaggagcggga gtggccggag acgggccgcc cgcgccgcgc 4320 ggcgatctcc tccttcggcatcagcggcac caacgcccac atcgtggtcg aacaggcccc 4380 ggaagccggg gaggcggcggtcaccaccac cgccccggaa gcaggggaag ccggggaagc 4440 ggcggacacc accgccaccacgacgccggc cgcggtcggc gtccccgaac ccgtacgcgc 4500 ccccgtcgtg gtctccgcgcgggacgccgc cgccctgcgc gcccaggccg ttcggctgcg 4560 gaccttcctc gacggccgaccggacgtcac cgtcgccgac ctcggacgct cgctggccgc 4620 ccgtaccgcc ttcgagcacaaggccgccct caccaccgcc accagggacg agctgctcgc 4680 cgggctcgac gccctcggccgcggggagca agccacgggc ctggtcaccg gcgaaccggc 4740 cagggccgga cgcacggccttcctgttcac cggccaggga gcgcagcgcg tcgccatggg 4800 cgaggaactg cgcgccgcgcaccccgtgtt cgccgccgcc ctcgacaccg tgtacgcggc 4860 cctcgaccgt cacctcgaccggccgctgcg ggagatcgtc gccgccgggg aggagctgga 4920 cctcaccgcg tacacccagcccgccctctt cgccttcgag gtggcgctgt tccgcctcct 4980 cgaacaccac ggcctcgtccccgacctgct caccggccac tccgtcggcg agatcgccgc 5040 cgcgcacgtc gccggtgtcctctccctcga cgacgccgca cgtctcgtca ccgcccgcgg 5100 ccggctcatg 
cagtcggcccgcgagggcgg cgcgatgatc gccgtgcagg cgggcgaggc 5160 cgaggtcgtc gagtccctgaagggctacga gggcagggtc gccgtcgccg ccgtcaacgg 5220 acccaccgcc gtggtcgtctccggcgacgc ggacgccgcc gaggagatcc gcgccgtatg 5280 ggcgggacgc ggccggcgcacccgcaggct gcgcgtcagc cacgccttcc actccccgca 5340 catggacgac gtcctcgacgagttcctccg ggtcgccgag ggcctgacct tcgaggagcc 5400 gcggatcccc gtcgtctccacggtcaccgg cgcgctcgtc acgtccggcg agctcacctc 5460 gcccgcgtac tgggtcgaccagatccggcg gcccgtgcgc ttcctggacg ccgtccgcac 5520 cctggccgcc caggacgcgaccgtcctcgt cgagatcggc cccgacgccg tcctcacggc 5580 actcgccgag gaggctctcgcgcccggcac ggacgccccg gacgcccggg acgtcacggt 5640 cgtcccgctg ctgcgcgcggggcgccccga gcccgagacc ctcgccgccg gtctcgcgac 5700 cgcccatgtc cacggcgcacccttggaccg ggcgtcgttc ttcccggacg ggcgccgcac 5760 ggacctgccc acgtacgccttccggcgcga gcactactgg ctgacgcccg aggcccgtac 5820 ggacgcccgc gcactcggcttcgacccggc gcggcacccg ctgctgacga ccacggtcga 5880 ggtcgccggc ggcgacggcgtcctgctgac cggccgtctc tccctgaccg accagccctg 5940 gctggccgac cacatggtcaacggcgccgt cctgttgccg gccaccgcct tcctggagct 6000 cgccctcgcg gcgggcgaccacgtcggggc ggtccgggtg gaggaactca ccctcgaagc 6060 gccgctcgtc ctgcccgagcggggcgccgt ccgcatccag gtcggcgtga gcggcgacgg 6120 cgagtcgccg gccgggcgcaccttcggtgt gtacagcacc cccgactccg gcgacaccgg 6180 tgacgacgcg ccccgggagtggacccgcca tgtctccggc gtactcggcg aaggggaccc 6240 ggccacggag tcggaccaccccggcaccga cggggacggt tcagcggcct ggccgcctgc 6300 ggcggcgacc gccacacccctcgacggcgt ctacgaccgg ctcgcggagc tcggctacgg 6360 atacggtccg gccttccagggcctgacggg gctgtggcgc gacggcgccg acacgctcgc 6420 cgagatccgg ctgcccgcggcgcagcacga gagcgcgggg ctcttcggcg tacacccggc 6480 gctgctcgac gcggcgctccacccgatcgt cctggagggc aactcagctg ccggtgcctg 6540 tgacgccgat accgacgcgaccgaccggat ccggctgccg ttcgcgtggg cgggggtgac 6600 cctccacgcc gaaggggccaccgcgctccg cgtacggatc acacccaccg gcccggacac 6660 ggtcacgctc cgcctcaccgacaccaccgg tgcgcccgtg gccaccgtgg agtccctgac 6720 cctgcgcgcg gtggcgaaggaccggctggg caccaccgcc gggcgcgtcg acgacgccct 6780 gttcacggtc gtgtggacggagaccggcac accggaaccc 
gcagggcgcg gagccgtgga 6840 ggtcgaggaa ctcgtcgacctcgccggcct cggcgacctc gtggagctcg gcgccgcgga 6900 cgtcgtcctc cgggccgaccgctggacgct cgacggggac ccgtccgccg ccgcgcgcac 6960 agccgtccgg cgcaccctcgccatcgtcca ggagttcctg tccgagccgc gcttcgacgg 7020 ctcgcgactg gtgtgcgtcaccaggggcgc ggtcgccgca ctccccggcg aggacgtcac 7080 ctccctcgcc accggccccctctggggcct cgtccgctcc gcccagtccg agaacccggg 7140 acgcctgttc ctcctggacctgggtgaagg cgaaggcgag cgcgacggag ccgaggagct 7200 gatccgcgcg gccacggccggggacgagcc gcagctcgcg gcacgggacg gccgactgct 7260 cgcgccgagg ctggcccgtaccgccgccct ttcgagtgag gacaccgccg gcggcgccga 7320 ccgtttcggc cccgacggcaccgtcctcgt caccgggggc accggaggcc tcggagcgct 7380 cctcgcccgc cacctcgtggagcgtcacgg ggtgcgccgg ctgctgctgg tgagccgccg 7440 cggggccgac gcccccggcgcggccgacct gggcgaggac ctcgcgggcc tcggcgcgga 7500 ggtggcgttc gccgccgccgacgccgccga ccgcgagagc ctggcgcggg cgatcgccac 7560 cgtgcccgcc gagcatccgctgacggccgt cgtgcacacg gcgggagtcg tcgacgacgc 7620 gacggtggag gcgctcacaccggaacggct ggacgcggta ctgcgcccga aggtcgacgc 7680 cgcgtggaac ctgcacgagctcaccaagga cctgcggctc gacgccttcg tcctcttctc 7740 ctccgtctcc ggcatcgtcggcaccgccgg ccaggccaac tacgcggcgg ccaacacggg 7800 cctcgacgcc ctcgccgcccaccgcgccgc cacgggcctg gccgccacgt cgctggcctg 7860 gggcctctgg gacggcacgcacggcatggg cggcacgctc ggcgccgccg acctcgcccg 7920 ctggagccgg gccggaatcaccccgctcac cccgctgcag ggcctcgcgc tcttcgacgc 7980 cgcggtcgcc agggacgacgccctcctcgt acccgccggg ctccgtccca ccgcccaccg 8040 gggcacggac ggacagcctcctgcgctgtg gcgcggcctc gtccgggcgc gcccgcgccg 8100 tgccgcgcgg acggccgccgaggcggcgga cacgaccggc ggctggctga gcgggctcgc 8160 cgcacagtcc cccgaggagcggcgcagcac agccgtcacg ctcgtgacgg gtgtcgtcgc 8220 ggacgtcctc gggcacgccgactccgccgc ggtcggggcg gagcggtcct tcaaggacct 8280 cggcttcgac tccctggccggggtggagct ccgcaaccgg ctgaacgccg ccaccggcct 8340 gcggctcccc gcgaccacggtcttcgacca tccctcgccg gccgcgctcg cgtcccatct 8400 cctcgcccag gtgcccgggttgaaggaggg gacggcggcg accgcgaccg tcgtggccga 8460 gcggggcgct tccttcggtgaccgtgcgac cgacgacgat ccgatcgcga tcgtgggcat 8520 ggcatgccgc 
tatccgggtggtgtgtcgtc gccggaggac ctgtggcggc tggtggccga 8580 ggggacggac gcgatcagcgagttccccgt caaccgcggc tgggacctgg agagcctcta 8640 cgacccggat cccgagtcgaagggcaccac gtactgccgg gagggcgggt tcctggaagg 8700 cgccggtgac ttcgacgccgccttcttcgg catctcgccg cgcgaggccc tggtgatgga 8760 cccgcagcag cggctgctgctggaggtgtc ctgggaggcg ctggaacgcg cgggcatcga 8820 cccgtcctcg ctgcgcggcagccgcggtgg tgtctacgtg ggcgccgcgc acggctcgta 8880 cgcctccgat ccccggctggtgcccgaggg ctcggagggc tatctgctga ccggcagcgc 8940 cgacgcggtg atgtccggccgcatctccta cgcgctcggt ctcgaaggac cgtccatgac 9000 ggtggagacg gcctgctcctcctcgctggt ggcgctgcat ctggcggtac gggcgctgcg 9060 gcacggcgag tgcgggctcgcgctggcggg cggggtggcg gtgatggccg atccggcggc 9120 gttcgtggag ttctcccggcagaaggggct ggccgccgac ggccgctgca aggcgttctc 9180 ggccgccgcc gacggcaccggctgggccga gggcgtcggc gtgctcgtcc tggagcggct 9240 gtcggacgcg cgccgcgcggggcacacggt cctcggcctg gtcaccggca ccgcggtcaa 9300 ccaggacggt gcctccaacgggctgaccgc gcccaacggc ccagcccagc aacgcgtcat 9360 cgccgaggcg ctcgccgacgccgggctgtc cccggaggac gtggacgcgg tcgaggcgca 9420 cggcaccggc acccggctcggcgaccccat cgaggccggg gcgctgctcg ccgcctccgg 9480 acggaaccgt tccggcgaccacccgctgtg gctcggctcg ctgaagtcca acatcgggca 9540 tgcccaggcc gccgccggtgtcggcggcgt catcaagatg ctccaggcgc tgcggcacgg 9600 cttgctgccc cgcaccctccacgccgacga gccgaccccg catgccgact ggagctccgg 9660 ccgggtacgg ctgctcacctccgaggtgcc gtggcagcgg accggccggc cccggcggac 9720 cggggtgtcc gccttcggcgtcggcggcac caatgcccat gtcgtcctcg aagaggcacc 9780 cgccccgccc gcgccggaaccggccgggga ggcccccggc ggctcccgcg ccgcagaagg 9840 ggcggaaggg cccctggcctgggtggtctc cggacgcgac gagccggccc tgcggtccca 9900 ggcccggcgg ctccgcgaccacctctcccg cacccccggg gcccgcccgc gtgacatcgc 9960 cttctccctc gccgccacgcgcgcagcctt tgaccaccgc gccgtgctga tcggctcgga 10020 cggggccgaa ctcgccgccgccctggacgc gttggccgaa ggacgcgacg gtccggcggt 10080 ggtgcgcgga gtccgcgaccgggacggcag gatggccttc ctcttcaccg ggcagggcag 10140 ccagcgcgcc gggatggcccacgacctgca tgccgcccat accttcttcg cgtccgccct 10200 cgacgaggtg acggaccgtctcgacccgct gctcggccgg 
ccgctcggcg cgctgctgga 10260 cgcccgaccc ggctcgcccgaagcggcact cctggaccgg accgagtaca cccagccggc 10320 gctcttcgcc gtcgaggtggcgctccaccg gctgctggag cactggggga tgcgccccga 10380 cctgctgctg gggcactcggtgggcgaact ggcggccgcc cacgtcgcgg gtgtgctcga 10440 tctcgacgac gcctgcgcgctggtggccgc ccgcggcagg ctgatgcagc gcctgccgcc 10500 cggcggcgcg atggtctccgtgcgggccgg cgaggacgag gtccgcgcac tgctggccgg 10560 ccgcgaggac gccgtctgcgtcgccgcggt gaacggcccc cggtcggtgg tgatctccgg 10620 cgcggaggaa gcggtggccgaggcggcggc gcagctcgcc ggacgaggcc gccgcaccag 10680 gcggctccgc gtcgcgcacgccttccactc acccctgatg gacggcatgc tcgccggatt 10740 ccgggaggtc gccgccggcctgcgctaccg ggaaccggag ctgacggtcg tctccacggt 10800 cacggggcgg cccgcccgccccggtgaact caccggcccc gactactggg tggcccaggt 10860 ccgtgagccc gtgcgcttcgcggacgcggt ccgcacggca caccgcctcg gagcccgcac 10920 cttcctggag accggcccggacggcgtgct gtgcggcatg gcagaggagt gcctggagga 10980 cgacaccgtg gccctgctgccggcgatcca caagcccggc accgcgccgc acggtccggc 11040 ggctcccggc gcgctgcgggcggccgccgc cgcgtacggc cggggcgccc gggtggactg 11100 ggccgggatg cacgccgacggccccgaggg gccggcccgc cgcgtcgaac tgcccgtcca 11160 cgccttccgg caccgccgctactggctcgc cccgggccgc gcggcggaca ccgacgactg 11220 gatgtaccgg atcggctgggaccggctgcc ggctgtgacc ggcggggccc ggaccgccgg 11280 ccgctggctg gtgatccaccccgacagccc gcgctgccgg gagctgtccg gccacgccga 11340 acgcgcgctg cgcgccgcgggcgcgagccc cgtaccgctg cccgtggacg ctccggccgc 11400 cgaccgggcg tccttcgcggcactgctgcg ctccgccacc ggacctgaca cacgaggtga 11460 cacagccgcg cccgtggccggtgtgctgtc gctgctgtcc gaggaggatc ggccccatcg 11520 ccagcacgcc ccggtacccgccggggtcct ggcgacgctg tccctgatgc aggctatgga 11580 ggaggaggcg gtggaggctcgcgtgtggtg cgtctcccgc gccgcggtcg ccgccgccga 11640 ccgggaacgg cccgtcggcgcgggcgccgc cctgtggggg ctggggcggg tggccgccct 11700 ggaacgcccc acccggtggggcggtctcgt ggacctgccc gcctcgcccg gtgcggcgca 11760 ctgggcggcc gccgtggaacggctcgccgg tcccgaggac cagatcgccg tgcgcgcgtc 11820 cggcagttgg ggccggcgcctcaccaggct gccgcgcgac ggcggcggcc ggacggccgc 11880 acccgcgtac cggccgcgcggcacggtgct cgtcaccggt ggcaccggcg 
cgctcggcgg 11940 gcatctcgcc cgctggctcgccgcggcggg cgccgaacac ctggcgctca ccagccgccg 12000 gggcccggac gcgcccggcgccgccggact cgaggccgaa ctcctcctcc tgggcgccaa 12060 ggtgacgttc gccgcctgcgacaccgccga ccgcgacggc ctcgcccggg tcctgcgggc 12120 gataccggag gacaccccgctcaccgcggt gttccacgcc gcgggcgtac cgcaggtcac 12180 gccgctgtcc cgtacctcgcccgagcactt cgccgacgtg tacgcgggca aggcggcggg 12240 cgccgcgcac ctggacgaactgacccgcga actcggcgcc ggactcgacg cgttcgtcct 12300 ctactcctcc ggcgccggcgtctggggcag cgccggccag ggtgcctacg ccgccgccaa 12360 cgccgccctg gacgcgctcgcccggcgccg tgcggcggac ggactccccg ccacctccat 12420 cgcctggggc gtgtggggcggcggcggtat gggggccgac gaggcgggcg cggagtatct 12480 gggccggcgc ggtatgcgccccatggcacc ggtctccgcg ctccgggcga tggccaccgc 12540 catcgcctcc ggggaaccctgccccaccgt cacccacacc gactgggagc gcttcggcga 12600 gggcttcacc gccttccggcccagccctct gatcgcgggg ctcggcacgc cgggcggcgg 12660 ccgggcggcg gagacccccgaggaggggaa cgccaccgct gcggcggacc tcaccgccct 12720 gccgcccgcc gaactccgcaccgcgctgcg cgagctggtg cgagcccgga ccgccgcggc 12780 gctcggcctc gacgacccggccgaggtcgc cgagggcgaa cggttccccg ccatgggctt 12840 cgactccctg gccaccgtacggctgcgccg cggactcgcc tcggccacgg gcctcgacct 12900 gccccccgat ctgctcttcgaccgggacac cccggccgcg ctcgccgccc acctggccga 12960 actgctcgcc accgcacgggaccacggacc cggcggcccc gggaccggtg ccgcgccggc 13020 cgatgccgga agcggcctgccggccctcta ccgggaggcc gtccgcaccg gccgggccgc 13080 ggaaatggcc gaactgctcgccgccgcttc ccggttccgc cccgccttcg ggacggcgga 13140 ccggcagccg gtggccctcgtgccgctggc cgacggcgcg gaggacaccg ggctcccgct 13200 gctcgtgggc tgcgccgggacggcggtggc ctccggcccg gtggagttca ccgccttcgc 13260 cggagcgctg gcggacctcccggcggcggc cccgatggcc gcgctgccgc agcccggctt 13320 tctgccggga gaacgagtcccggccacccc ggaggcattg ttcgaggccc aggcggaagc 13380 gctgctgcgc tacgcggccggccggccctt cgtgctgctg gggcactccg ccggcgccaa 13440 catggcccac gccctgacccgtcatctgga ggcgaacggt ggcggccccg cagggctggt 13500 gctcatggac atctacacccccgccgaccc cggcgcgatg ggcgtctggc ggaacgacat 13560 gttccagtgg gtctggcggcgctcggacat ccccccggac gaccaccgcc tcacggccat 
13620 gggcgcctac caccggctgcttctcgactg gtcgcccacc cccgtccgcg cccccgtact 13680 gcatctgcgc gccgcggaacccatgggcga ctggccaccc ggggacaccg gctggcagtc 13740 ccactgggac ggcgcgcacaccaccgccgg catccccgga aaccacttca cgatgatgac 13800 cgaacacgcc tccgccgccgcccggctcgt gcacggctgg ctcgcggaac ggaccccgtc 13860 cgggcagggc gggtcaccgtcccgcgcggc ggggagagag gagaggccgt gaacacggca 13920 gccggcccga ccggcaccgccgccggcggc accaccgccc cggcggcggc acacgacctg 13980 tcccgcgccg gacgcaggctccaactcacc cgggccgcac agtggttcgc cggcaaccag 14040 ggagacccct acgggatgatcctgcgcgcc ggcaccgccg acccggcacc gtacgaggaa 14100 gagatccccg ggtaccgagctcgaattctt aattaaggag gtcgtagatg agtaacaaga 14160 acaacgatga gctgcagcggcaggcctcgg aaaacaccct ggggctgaac ccggtcatcg 14220 gtatccgccg caaagacctgttgagctcgg cacgcaccgt gctgcgccag gccgtgcgcc 14280 aaccgctgca cagcgccaagcatgtggccc actttggcct ggagctgaag aacgtgctgc 14340 tgggcaagtc cagccttgccccggaaagcg acgaccgtcg cttcaatgac ccggcatgga 14400 gcaacaaccc actttaccgccgctacctgc aaacctatct ggcctggcgc aaggagctgc 14460 aggactggat cggcaacagcgacctgtcgc cccaggacat cagccgcggc cagttcgtca 14520 tcaacctgat gaccgaagccatggctccga ccaacaccct gtccaacccg gcagcagtca 14580 aacgcttctt cgaaaccggcggcaagagcc tgctcgatgg cctgtccaac ctggccaagg 14640 acctggtcaa caacggtggcatgcccagcc aggtgaacat ggacgccttc gaggtgggca 14700 agaacctggg caccagtgaaggcgccgtgg tgtaccgcaa cgatgtgctg gagctgatcc 14760 agtacaagcc catcaccgagcaggtgcatg cccgcccgct gctggtggtg ccgccgcaga 14820 tcaacaagtt ctacgtattcgacctgagcc cggaaaagag cctggcacgc tactgcctgc 14880 gctcgcagca gcagaccttcatcatcagct ggcgcaaccc gaccaaagcc cagcgcgaat 14940 ggggcctgtc cacctacatcgacgcgctca aggaggcggt cgacgcggtg ctggcgatta 15000 ccggcagcaa ggacctgaacatgctcggtg cctgctccgg cggcatcacc tgcacggcat 15060 tggtcggcca ctatgccgccctcggcgaaa acaaggtcaa tgccctgacc ctgctggtca 15120 gcgtgctgga caccaccatggacaaccagg tcgccctgtt cgtcgacgag cagactttgg 15180 aggccgccaa gcgccactcctaccaggccg gtgtgctcga aggcagcgag atggccaagg 15240 tgttcgcctg gatgcgccccaacgacctga tctggaacta ctgggtcaac aactacctgc 15300 
tcggcaacga gccgccggtgttcgacatcc tgttctggaa caacgacacc acgcgcctgc 15360 cggccgcctt ccacggcgacctgatcgaaa tgttcaagag caacccgctg acccgcccgg 15420 acgccctgga ggtttgcggcactccgatcg acctgaaaca ggtcaaatgc gacatctaca 15480 gccttgccgg caccaacgaccacatcaccc cgtggcagtc atgctaccgc tcggcgcacc 15540 tgttcggcgg caagatcgagttcgtgctgt ccaacagcgg ccacatccag agcatcctca 15600 acccgccagg caaccccaaggcgcgcttca tgaccggtgc cgatcgcccg ggtgacccgg 15660 tggcctggca ggaaaacgccaccaagcatg ccgactcctg gtggctgcac tggcaaagct 15720 ggctgggcga gcgtgccggcgagctggaaa aggcgccgac ccgcctgggc aaccgtgcct 15780 atgccgctgg cgaggcatccccgggcacct acgttcacga gcgttgagct gcagcgccgt 15840 ggccacctgc gggacgccacggtgttgaat tc 15872 2 5215 PRT Streptomyces venezuelae 2 Met Asn Glu AlaIle Ala Val Val Gly Met Ser Cys Arg Leu Pro Lys 1 5 10 15 Ala Ser AsnPro Ala Ala Phe Trp Glu Leu Leu Arg Asn Gly Glu Ser 20 25 30 Ala Val ThrAsp Val Pro Ser Gly Arg Trp Thr Ser Val Leu Gly Gly 35 40 45 Ala Asp AlaGlu Glu Pro Ala Glu Ser Gly Val Arg Arg Gly Gly Phe 50 55 60 Leu Asp SerLeu Asp Leu Phe Asp Ala Ala Phe Phe Gly Ile Ser Pro 65 70 75 80 Arg GluAla Ala Ala Met Asp Pro Gln Gln Arg Leu Val Leu Glu Leu 85 90 95 Ala TrpGlu Ala Leu Glu Asp Ala Gly Ile Val Pro Gly Thr Leu Ala 100 105 110 GlySer Arg Thr Ala Val Phe Val Gly Thr Leu Arg Asp Asp Tyr Thr 115 120 125Ser Leu Leu Tyr Gln His Gly Glu Gln Ala Ile Thr Gln His Thr Met 130 135140 Ala Gly Val Asn Arg Gly Val Ile Ala Asn Arg Val Ser Tyr His Leu 145150 155 160 Gly Leu Gln Gly Pro Ser Leu Thr Val Asp Ala Ala Gln Ser SerSer 165 170 175 Leu Val Ala Val His Leu Ala Cys Glu Ser Leu Arg Ala GlyGlu Ser 180 185 190 Thr Thr Ala Leu Val Ala Gly Val Asn Leu Asn Ile LeuAla Glu Ser 195 200 205 Ala Val Thr Glu Glu Arg Phe Gly Gly Leu Ser ProAsp Gly Thr Ala 210 215 220 Tyr Thr Phe Asp Ala Arg Ala Asn Gly Phe ValArg Gly Glu Gly Gly 225 230 235 240 Gly Val Val Val Leu Lys Pro Leu SerArg Ala Leu Ala Asp Gly Asp 245 250 255 Arg Val His Gly Val Ile Arg AlaSer Ala Val Asn Asn Asp Gly Ala 260 265 270 Thr Pro Gly 
Leu Thr Val ProSer Arg Ala Ala Gln Glu Lys Val Leu 275 280 285 Arg Glu Ala Tyr Arg LysAla Ala Leu Asp Pro Ser Ala Val Gln Tyr 290 295 300 Val Glu Leu His GlyThr Gly Thr Pro Val Gly Asp Pro Ile Glu Ala 305 310 315 320 Ala Ala LeuGly Ala Val Leu Gly Ser Ala Arg Pro Ala Asp Glu Pro 325 330 335 Leu LeuVal Gly Ser Ala Lys Thr Asn Val Gly His Leu Glu Gly Ala 340 345 350 AlaGly Ile Val Gly Leu Ile Lys Thr Leu Leu Ala Leu Gly Arg Arg 355 360 365Arg Ile Pro Ala Ser Leu Asn Phe Arg Thr Pro His Pro Asp Ile Pro 370 375380 Leu Asp Thr Leu Gly Leu Asp Val Pro Asp Gly Leu Arg Glu Trp Pro 385390 395 400 His Pro Asp Arg Glu Leu Leu Ala Gly Val Ser Ser Phe Gly MetGly 405 410 415 Gly Thr Asn Ala His Val Val Leu Ser Glu Gly Pro Ala GlnGly Gly 420 425 430 Glu Gln Pro Gly Ile Asp Glu Glu Thr Pro Val Asp SerGly Ala Ala 435 440 445 Leu Pro Phe Val Val Thr Gly Arg Gly Gly Glu AlaLeu Arg Ala Gln 450 455 460 Ala Arg Arg Leu His Glu Ala Val Glu Ala AspPro Glu Leu Ala Pro 465 470 475 480 Ala Ala Leu Ala Arg Ser Leu Val ThrThr Arg Thr Val Phe Thr His 485 490 495 Arg Ser Val Val Leu Ala Pro AspArg Ala Arg Leu Leu Asp Gly Leu 500 505 510 Gly Ala Leu Ala Ala Gly ThrPro Ala Pro Gly Val Val Thr Gly Thr 515 520 525 Pro Ala Pro Gly Arg LeuAla Val Leu Phe Ser Gly Gln Gly Ala Gln 530 535 540 Arg Thr Gly Met GlyMet Glu Leu Tyr Ala Ala His Pro Ala Phe Ala 545 550 555 560 Thr Ala PheAsp Ala Val Ala Ala Glu Leu Asp Pro Leu Leu Asp Arg 565 570 575 Pro LeuAla Glu Leu Val Ala Ala Gly Asp Thr Leu Asp Arg Thr Val 580 585 590 HisThr Gln Pro Ala Leu Phe Ala Val Glu Val Ala Leu His Arg Leu 595 600 605Val Glu Ser Trp Gly Val Thr Pro Asp Leu Leu Ala Gly His Ser Val 610 615620 Gly Glu Ile Ser Ala Ala His Val Ala Gly Val Leu Ser Leu Arg Asp 625630 635 640 Ala Ala Arg Leu Val Ala Ala Arg Gly Arg Leu Met Gln Ala LeuPro 645 650 655 Glu Gly Gly Ala Met Val Ala Val Glu Ala Ser Glu Glu GluVal Leu 660 665 670 Pro His Leu Ala Gly Arg Glu Arg Glu Leu Ser Leu AlaAla Val Asn 675 680 685 Gly Pro Arg Ala Val Val Leu Ala Gly Ala Glu 
ArgAla Val Leu Asp 690 695 700 Val Ala Glu Leu Leu Arg Glu Gln Gly Arg ArgThr Lys Arg Leu Ser 705 710 715 720 Val Ser His Ala Phe His Ser Pro LeuMet Glu Pro Met Leu Asp Asp 725 730 735 Phe Arg Arg Val Val Glu Glu LeuAsp Phe Gln Glu Pro Arg Val Asp 740 745 750 Val Val Ser Thr Val Thr GlyLeu Pro Val Thr Ala Gly Gln Trp Thr 755 760 765 Asp Pro Glu Tyr Trp ValAsp Gln Val Arg Arg Pro Val Arg Phe Leu 770 775 780 Asp Ala Val Arg ThrLeu Glu Glu Ser Gly Ala Asp Thr Phe Leu Glu 785 790 795 800 Leu Gly ProAsp Gly Val Cys Ser Ala Met Ala Ala Asp Ser Val Arg 805 810 815 Asp GlnGlu Ala Ala Thr Ala Val Ser Ala Leu Arg Lys Gly Arg Pro 820 825 830 GluPro Gln Ser Leu Leu Ala Ala Leu Thr Thr Val Phe Val Arg Gly 835 840 845His Asp Val Asp Trp Thr Ala Ala His Gly Ser Thr Gly Thr Val Arg 850 855860 Val Pro Leu Pro Thr Tyr Ala Phe Gln Arg Glu Arg His Trp Phe Asp 865870 875 880 Gly Ala Ala Arg Thr Ala Ala Pro Leu Thr Ala Gly Arg Ser GlyThr 885 890 895 Gly Ala Gly Thr Gly Pro Ala Ala Gly Val Thr Ser Gly GluGly Glu 900 905 910 Gly Glu Gly Glu Gly Ala Gly Ala Gly Gly Gly Asp ArgPro Ala Arg 915 920 925 His Glu Thr Thr Glu Arg Val Arg Ala His Val AlaAla Val Leu Glu 930 935 940 Tyr Asp Asp Pro Thr Arg Val Glu Leu Gly LeuThr Phe Lys Glu Leu 945 950 955 960 Gly Phe Asp Ser Leu Met Ser Val GluLeu Arg Asn Ala Leu Val Asp 965 970 975 Asp Thr Gly Leu Arg Leu Pro SerGly Leu Leu Phe Asp His Pro Thr 980 985 990 Pro Arg Ala Leu Ala Ala HisLeu Gly Asp Leu Leu Thr Gly Gly Ser 995 1000 1005 Gly Glu Thr Gly SerAla Asp Gly Ile Pro Pro Ala Thr Pro Ala Asp 1010 1015 1020 Thr Thr AlaGlu Pro Ile Ala Ile Ile Gly Met Ala Cys Arg Tyr Pro 1025 1030 1035 1040Gly Gly Val Thr Ser Pro Glu Asp Leu Trp Arg Leu Val Ala Glu Gly 10451050 1055 Arg Asp Ala Val Ser Gly Leu Pro Thr Asp Arg Gly Trp Asp GluAsp 1060 1065 1070 Leu Phe Asp Ala Asp Pro Asp Arg Ser Gly Lys Ser SerVal Arg Glu 1075 1080 1085 Gly Gly Phe Leu His Asp Ala Ala Leu Phe AspAla Gly Phe Phe Gly 1090 1095 1100 Ile Ser Pro Arg Glu Ala Leu Gly MetAsp Pro Gln Gln Arg 
Leu Leu 1105 1110 1115 1120 Leu Glu Thr Ala Trp GluAla Val Glu Arg Ala Gly Leu Asp Pro Glu 1125 1130 1135 Gly Leu Lys GlySer Arg Thr Ala Val Phe Val Gly Ala Thr Ala Leu 1140 1145 1150 Asp TyrGly Pro Arg Met His Asp Gly Ala Glu Gly Val Glu Gly His 1155 1160 1165Leu Leu Thr Gly Thr Thr Pro Ser Val Met Ser Gly Arg Ile Ala Tyr 11701175 1180 Gln Leu Gly Leu Thr Gly Pro Ala Val Thr Val Asp Thr Ala CysSer 1185 1190 1195 1200 Ser Ser Leu Val Ala Leu His Leu Ala Val Arg SerLeu Arg Gln Gly 1205 1210 1215 Glu Ser Ser Leu Ala Leu Ala Gly Gly AlaThr Val Met Ser Thr Pro 1220 1225 1230 Gly Met Phe Val Glu Phe Ser ArgGln Arg Gly Leu Ala Ala Asp Gly 1235 1240 1245 Arg Ser Lys Ala Phe SerAsp Ser Ala Asp Gly Thr Ser Trp Ala Glu 1250 1255 1260 Gly Val Gly LeuLeu Val Val Glu Arg Leu Ser Asp Ala Glu Arg Asn 1265 1270 1275 1280 GlyHis Pro Val Leu Ala Val Ile Arg Gly Ser Ala Val Asn Gln Asp 1285 12901295 Gly Ala Ser Asn Gly Leu Thr Ala Pro Asn Gly Pro Ser Gln Gln Arg1300 1305 1310 Val Ile Arg Gln Ala Leu Ala Asp Ala Gly Leu Thr Pro AlaAsp Val 1315 1320 1325 Asp Ala Val Glu Ala His Gly Thr Gly Thr Arg LeuGly Asp Pro Ile 1330 1335 1340 Glu Ala Glu Ala Ile Leu Gly Thr Tyr GlyArg Asp Arg Gly Glu Gly 1345 1350 1355 1360 Ala Pro Leu Gln Leu Gly SerLeu Lys Ser Asn Ile Gly His Ala Gln 1365 1370 1375 Ala Ala Ala Gly ValGly Gly Leu Ile Lys Met Val Leu Ala Met Arg 1380 1385 1390 His Gly ValLeu Pro Arg Thr Leu His Val Asp Arg Pro Thr Thr Arg 1395 1400 1405 ValAsp Trp Glu Ala Gly Gly Val Glu Leu Leu Thr Glu Glu Arg Glu 1410 14151420 Trp Pro Glu Thr Gly Arg Pro Arg Arg Ala Ala Ile Ser Ser Phe Gly1425 1430 1435 1440 Ile Ser Gly Thr Asn Ala His Ile Val Val Glu Gln AlaPro Glu Ala 1445 1450 1455 Gly Glu Ala Ala Val Thr Thr Thr Ala Pro GluAla Gly Glu Ala Gly 1460 1465 1470 Glu Ala Ala Asp Thr Thr Ala Thr ThrThr Pro Ala Ala Val Gly Val 1475 1480 1485 Pro Glu Pro Val Arg Ala ProVal Val Val Ser Ala Arg Asp Ala Ala 1490 1495 1500 Ala Leu Arg Ala GlnAla Val Arg Leu Arg Thr Phe Leu Asp Gly Arg 1505 1510 1515 1520 
Pro AspVal Thr Val Ala Asp Leu Gly Arg Ser Leu Ala Ala Arg Thr 1525 1530 1535Ala Phe Glu His Lys Ala Ala Leu Thr Thr Ala Thr Arg Asp Glu Leu 15401545 1550 Leu Ala Gly Leu Asp Ala Leu Gly Arg Gly Glu Gln Ala Thr GlyLeu 1555 1560 1565 Val Thr Gly Glu Pro Ala Arg Ala Gly Arg Thr Ala PheLeu Phe Thr 1570 1575 1580 Gly Gln Gly Ala Gln Arg Val Ala Met Gly GluGlu Leu Arg Ala Ala 1585 1590 1595 1600 His Pro Val Phe Ala Ala Ala LeuAsp Thr Val Tyr Ala Ala Leu Asp 1605 1610 1615 Arg His Leu Asp Arg ProLeu Arg Glu Ile Val Ala Ala Gly Glu Glu 1620 1625 1630 Leu Asp Leu ThrAla Tyr Thr Gln Pro Ala Leu Phe Ala Phe Glu Val 1635 1640 1645 Ala LeuPhe Arg Leu Leu Glu His His Gly Leu Val Pro Asp Leu Leu 1650 1655 1660Thr Gly His Ser Val Gly Glu Ile Ala Ala Ala His Val Ala Gly Val 16651670 1675 1680 Leu Ser Leu Asp Asp Ala Ala Arg Leu Val Thr Ala Arg GlyArg Leu 1685 1690 1695 Met Gln Ser Ala Arg Glu Gly Gly Ala Met Ile AlaVal Gln Ala Gly 1700 1705 1710 Glu Ala Glu Val Val Glu Ser Leu Lys GlyTyr Glu Gly Arg Val Ala 1715 1720 1725 Val Ala Ala Val Asn Gly Pro ThrAla Val Val Val Ser Gly Asp Ala 1730 1735 1740 Asp Ala Ala Glu Glu IleArg Ala Val Trp Ala Gly Arg Gly Arg Arg 1745 1750 1755 1760 Thr Arg ArgLeu Arg Val Ser His Ala Phe His Ser Pro His Met Asp 1765 1770 1775 AspVal Leu Asp Glu Phe Leu Arg Val Ala Glu Gly Leu Thr Phe Glu 1780 17851790 Glu Pro Arg Ile Pro Val Val Ser Thr Val Thr Gly Ala Leu Val Thr1795 1800 1805 Ser Gly Glu Leu Thr Ser Pro Ala Tyr Trp Val Asp Gln IleArg Arg 1810 1815 1820 Pro Val Arg Phe Leu Asp Ala Val Arg Thr Leu AlaAla Gln Asp Ala 1825 1830 1835 1840 Thr Val Leu Val Glu Ile Gly Pro AspAla Val Leu Thr Ala Leu Ala 1845 1850 1855 Glu Glu Ala Leu Ala Pro GlyThr Asp Ala Pro Asp Ala Arg Asp Val 1860 1865 1870 Thr Val Val Pro LeuLeu Arg Ala Gly Arg Pro Glu Pro Glu Thr Leu 1875 1880 1885 Ala Ala GlyLeu Ala Thr Ala His Val His Gly Ala Pro Leu Asp Arg 1890 1895 1900 AlaSer Phe Phe Pro Asp Gly Arg Arg Thr Asp Leu Pro Thr Tyr Ala 1905 19101915 1920 Phe Arg Arg Glu His Tyr Trp 
Leu Thr Pro Glu Ala Arg Thr AspAla 1925 1930 1935 Arg Ala Leu Gly Phe Asp Pro Ala Arg His Pro Leu LeuThr Thr Thr 1940 1945 1950 Val Glu Val Ala Gly Gly Asp Gly Val Leu LeuThr Gly Arg Leu Ser 1955 1960 1965 Leu Thr Asp Gln Pro Trp Leu Ala AspHis Met Val Asn Gly Ala Val 1970 1975 1980 Leu Leu Pro Ala Thr Ala PheLeu Glu Leu Ala Leu Ala Ala Gly Asp 1985 1990 1995 2000 His Val Gly AlaVal Arg Val Glu Glu Leu Thr Leu Glu Ala Pro Leu 2005 2010 2015 Val LeuPro Glu Arg Gly Ala Val Arg Ile Gln Val Gly Val Ser Gly 2020 2025 2030Asp Gly Glu Ser Pro Ala Gly Arg Thr Phe Gly Val Tyr Ser Thr Pro 20352040 2045 Asp Ser Gly Asp Thr Gly Asp Asp Ala Pro Arg Glu Trp Thr ArgHis 2050 2055 2060 Val Ser Gly Val Leu Gly Glu Gly Asp Pro Ala Thr GluSer Asp His 2065 2070 2075 2080 Pro Gly Thr Asp Gly Asp Gly Ser Ala AlaTrp Pro Pro Ala Ala Ala 2085 2090 2095 Thr Ala Thr Pro Leu Asp Gly ValTyr Asp Arg Leu Ala Glu Leu Gly 2100 2105 2110 Tyr Gly Tyr Gly Pro AlaPhe Gln Gly Leu Thr Gly Leu Trp Arg Asp 2115 2120 2125 Gly Ala Asp ThrLeu Ala Glu Ile Arg Leu Pro Ala Ala Gln His Glu 2130 2135 2140 Ser AlaGly Leu Phe Gly Val His Pro Ala Leu Leu Asp Ala Ala Leu 2145 2150 21552160 His Pro Ile Val Leu Glu Gly Asn Ser Ala Ala Gly Ala Cys Asp Ala2165 2170 2175 Asp Thr Asp Ala Thr Asp Arg Ile Arg Leu Pro Phe Ala TrpAla Gly 2180 2185 2190 Val Thr Leu His Ala Glu Gly Ala Thr Ala Leu ArgVal Arg Ile Thr 2195 2200 2205 Pro Thr Gly Pro Asp Thr Val Thr Leu ArgLeu Thr Asp Thr Thr Gly 2210 2215 2220 Ala Pro Val Ala Thr Val Glu SerLeu Thr Leu Arg Ala Val Ala Lys 2225 2230 2235 2240 Asp Arg Leu Gly ThrThr Ala Gly Arg Val Asp Asp Ala Leu Phe Thr 2245 2250 2255 Val Val TrpThr Glu Thr Gly Thr Pro Glu Pro Ala Gly Arg Gly Ala 2260 2265 2270 ValGlu Val Glu Glu Leu Val Asp Leu Ala Gly Leu Gly Asp Leu Val 2275 22802285 Glu Leu Gly Ala Ala Asp Val Val Leu Arg Ala Asp Arg Trp Thr Leu2290 2295 2300 Asp Gly Asp Pro Ser Ala Ala Ala Arg Thr Ala Val Arg ArgThr Leu 2305 2310 2315 2320 Ala Ile Val Gln Glu Phe Leu Ser Glu Pro ArgPhe Asp Gly 
Ser Arg 2325 2330 2335 Leu Val Cys Val Thr Arg Gly Ala ValAla Ala Leu Pro Gly Glu Asp 2340 2345 2350 Val Thr Ser Leu Ala Thr GlyPro Leu Trp Gly Leu Val Arg Ser Ala 2355 2360 2365 Gln Ser Glu Asn ProGly Arg Leu Phe Leu Leu Asp Leu Gly Glu Gly 2370 2375 2380 Glu Gly GluArg Asp Gly Ala Glu Glu Leu Ile Arg Ala Ala Thr Ala 2385 2390 2395 2400Gly Asp Glu Pro Gln Leu Ala Ala Arg Asp Gly Arg Leu Leu Ala Pro 24052410 2415 Arg Leu Ala Arg Thr Ala Ala Leu Ser Ser Glu Asp Thr Ala GlyGly 2420 2425 2430 Ala Asp Arg Phe Gly Pro Asp Gly Thr Val Leu Val ThrGly Gly Thr 2435 2440 2445 Gly Gly Leu Gly Ala Leu Leu Ala Arg His LeuVal Glu Arg His Gly 2450 2455 2460 Val Arg Arg Leu Leu Leu Val Ser ArgArg Gly Ala Asp Ala Pro Gly 2465 2470 2475 2480 Ala Ala Asp Leu Gly GluAsp Leu Ala Gly Leu Gly Ala Glu Val Ala 2485 2490 2495 Phe Ala Ala AlaAsp Ala Ala Asp Arg Glu Ser Leu Ala Arg Ala Ile 2500 2505 2510 Ala ThrVal Pro Ala Glu His Pro Leu Thr Ala Val Val His Thr Ala 2515 2520 2525Gly Val Val Asp Asp Ala Thr Val Glu Ala Leu Thr Pro Glu Arg Leu 25302535 2540 Asp Ala Val Leu Arg Pro Lys Val Asp Ala Ala Trp Asn Leu HisGlu 2545 2550 2555 2560 Leu Thr Lys Asp Leu Arg Leu Asp Ala Phe Val LeuPhe Ser Ser Val 2565 2570 2575 Ser Gly Ile Val Gly Thr Ala Gly Gln AlaAsn Tyr Ala Ala Ala Asn 2580 2585 2590 Thr Gly Leu Asp Ala Leu Ala AlaHis Arg Ala Ala Thr Gly Leu Ala 2595 2600 2605 Ala Thr Ser Leu Ala TrpGly Leu Trp Asp Gly Thr His Gly Met Gly 2610 2615 2620 Gly Thr Leu GlyAla Ala Asp Leu Ala Arg Trp Ser Arg Ala Gly Ile 2625 2630 2635 2640 ThrPro Leu Thr Pro Leu Gln Gly Leu Ala Leu Phe Asp Ala Ala Val 2645 26502655 Ala Arg Asp Asp Ala Leu Leu Val Pro Ala Gly Leu Arg Pro Thr Ala2660 2665 2670 His Arg Gly Thr Asp Gly Gln Pro Pro Ala Leu Trp Arg GlyLeu Val 2675 2680 2685 Arg Ala Arg Pro Arg Arg Ala Ala Arg Thr Ala AlaGlu Ala Ala Asp 2690 2695 2700 Thr Thr Gly Gly Trp Leu Ser Gly Leu AlaAla Gln Ser Pro Glu Glu 2705 2710 2715 2720 Arg Arg Ser Thr Ala Val ThrLeu Val Thr Gly Val Val Ala Asp Val 2725 2730 2735 Leu 
Gly His Ala AspSer Ala Ala Val Gly Ala Glu Arg Ser Phe Lys 2740 2745 2750 Asp Leu GlyPhe Asp Ser Leu Ala Gly Val Glu Leu Arg Asn Arg Leu 2755 2760 2765 AsnAla Ala Thr Gly Leu Arg Leu Pro Ala Thr Thr Val Phe Asp His 2770 27752780 Pro Ser Pro Ala Ala Leu Ala Ser His Leu Leu Ala Gln Val Pro Gly2785 2790 2795 2800 Leu Lys Glu Gly Thr Ala Ala Thr Ala Thr Val Val AlaGlu Arg Gly 2805 2810 2815 Ala Ser Phe Gly Asp Arg Ala Thr Asp Asp AspPro Ile Ala Ile Val 2820 2825 2830 Gly Met Ala Cys Arg Tyr Pro Gly GlyVal Ser Ser Pro Glu Asp Leu 2835 2840 2845 Trp Arg Leu Val Ala Glu GlyThr Asp Ala Ile Ser Glu Phe Pro Val 2850 2855 2860 Asn Arg Gly Trp AspLeu Glu Ser Leu Tyr Asp Pro Asp Pro Glu Ser 2865 2870 2875 2880 Lys GlyThr Thr Tyr Cys Arg Glu Gly Gly Phe Leu Glu Gly Ala Gly 2885 2890 2895Asp Phe Asp Ala Ala Phe Phe Gly Ile Ser Pro Arg Glu Ala Leu Val 29002905 2910 Met Asp Pro Gln Gln Arg Leu Leu Leu Glu Val Ser Trp Glu AlaLeu 2915 2920 2925 Glu Arg Ala Gly Ile Asp Pro Ser Ser Leu Arg Gly SerArg Gly Gly 2930 2935 2940 Val Tyr Val Gly Ala Ala His Gly Ser Tyr AlaSer Asp Pro Arg Leu 2945 2950 2955 2960 Val Pro Glu Gly Ser Glu Gly TyrLeu Leu Thr Gly Ser Ala Asp Ala 2965 2970 2975 Val Met Ser Gly Arg IleSer Tyr Ala Leu Gly Leu Glu Gly Pro Ser 2980 2985 2990 Met Thr Val GluThr Ala Cys Ser Ser Ser Leu Val Ala Leu His Leu 2995 3000 3005 Ala ValArg Ala Leu Arg His Gly Glu Cys Gly Leu Ala Leu Ala Gly 3010 3015 3020Gly Val Ala Val Met Ala Asp Pro Ala Ala Phe Val Glu Phe Ser Arg 30253030 3035 3040 Gln Lys Gly Leu Ala Ala Asp Gly Arg Cys Lys Ala Phe SerAla Ala 3045 3050 3055 Ala Asp Gly Thr Gly Trp Ala Glu Gly Val Gly ValLeu Val Leu Glu 3060 3065 3070 Arg Leu Ser Asp Ala Arg Arg Ala Gly HisThr Val Leu Gly Leu Val 3075 3080 3085 Thr Gly Thr Ala Val Asn Gln AspGly Ala Ser Asn Gly Leu Thr Ala 3090 3095 3100 Pro Asn Gly Pro Ala GlnGln Arg Val Ile Ala Glu Ala Leu Ala Asp 3105 3110 3115 3120 Ala Gly LeuSer Pro Glu Asp Val Asp Ala Val Glu Ala His Gly Thr 3125 3130 3135 GlyThr Arg Leu Gly Asp Pro Ile 
Glu Ala Gly Ala Leu Leu Ala Ala 3140 31453150 Ser Gly Arg Asn Arg Ser Gly Asp His Pro Leu Trp Leu Gly Ser Leu3155 3160 3165 Lys Ser Asn Ile Gly His Ala Gln Ala Ala Ala Gly Val GlyGly Val 3170 3175 3180 Ile Lys Met Leu Gln Ala Leu Arg His Gly Leu LeuPro Arg Thr Leu 3185 3190 3195 3200 His Ala Asp Glu Pro Thr Pro His AlaAsp Trp Ser Ser Gly Arg Val 3205 3210 3215 Arg Leu Leu Thr Ser Glu ValPro Trp Gln Arg Thr Gly Arg Pro Arg 3220 3225 3230 Arg Thr Gly Val SerAla Phe Gly Val Gly Gly Thr Asn Ala His Val 3235 3240 3245 Val Leu GluGlu Ala Pro Ala Pro Pro Ala Pro Glu Pro Ala Gly Glu 3250 3255 3260 AlaPro Gly Gly Ser Arg Ala Ala Glu Gly Ala Glu Gly Pro Leu Ala 3265 32703275 3280 Trp Val Val Ser Gly Arg Asp Glu Pro Ala Leu Arg Ser Gln AlaArg 3285 3290 3295 Arg Leu Arg Asp His Leu Ser Arg Thr Pro Gly Ala ArgPro Arg Asp 3300 3305 3310 Ile Ala Phe Ser Leu Ala Ala Thr Arg Ala AlaPhe Asp His Arg Ala 3315 3320 3325 Val Leu Ile Gly Ser Asp Gly Ala GluLeu Ala Ala Ala Leu Asp Ala 3330 3335 3340 Leu Ala Glu Gly Arg Asp GlyPro Ala Val Val Arg Gly Val Arg Asp 3345 3350 3355 3360 Arg Asp Gly ArgMet Ala Phe Leu Phe Thr Gly Gln Gly Ser Gln Arg 3365 3370 3375 Ala GlyMet Ala His Asp Leu His Ala Ala His Thr Phe Phe Ala Ser 3380 3385 3390Ala Leu Asp Glu Val Thr Asp Arg Leu Asp Pro Leu Leu Gly Arg Pro 33953400 3405 Leu Gly Ala Leu Leu Asp Ala Arg Pro Gly Ser Pro Glu Ala AlaLeu 3410 3415 3420 Leu Asp Arg Thr Glu Tyr Thr Gln Pro Ala Leu Phe AlaVal Glu Val 3425 3430 3435 3440 Ala Leu His Arg Leu Leu Glu His Trp GlyMet Arg Pro Asp Leu Leu 3445 3450 3455 Leu Gly His Ser Val Gly Glu LeuAla Ala Ala His Val Ala Gly Val 3460 3465 3470 Leu Asp Leu Asp Asp AlaCys Ala Leu Val Ala Ala Arg Gly Arg Leu 3475 3480 3485 Met Gln Arg LeuPro Pro Gly Gly Ala Met Val Ser Val Arg Ala Gly 3490 3495 3500 Glu AspGlu Val Arg Ala Leu Leu Ala Gly Arg Glu Asp Ala Val Cys 3505 3510 35153520 Val Ala Ala Val Asn Gly Pro Arg Ser Val Val Ile Ser Gly Ala Glu3525 3530 3535 Glu Ala Val Ala Glu Ala Ala Ala Gln Leu Ala Gly Arg GlyArg 
Arg 3540 3545 3550 Thr Arg Arg Leu Arg Val Ala His Ala Phe His SerPro Leu Met Asp 3555 3560 3565 Gly Met Leu Ala Gly Phe Arg Glu Val AlaAla Gly Leu Arg Tyr Arg 3570 3575 3580 Glu Pro Glu Leu Thr Val Val SerThr Val Thr Gly Arg Pro Ala Arg 3585 3590 3595 3600 Pro Gly Glu Leu ThrGly Pro Asp Tyr Trp Val Ala Gln Val Arg Glu 3605 3610 3615 Pro Val ArgPhe Ala Asp Ala Val Arg Thr Ala His Arg Leu Gly Ala 3620 3625 3630 ArgThr Phe Leu Glu Thr Gly Pro Asp Gly Val Leu Cys Gly Met Ala 3635 36403645 Glu Glu Cys Leu Glu Asp Asp Thr Val Ala Leu Leu Pro Ala Ile His3650 3655 3660 Lys Pro Gly Thr Ala Pro His Gly Pro Ala Ala Pro Gly AlaLeu Arg 3665 3670 3675 3680 Ala Ala Ala Ala Ala Tyr Gly Arg Gly Ala ArgVal Asp Trp Ala Gly 3685 3690 3695 Met His Ala Asp Gly Pro Glu Gly ProAla Arg Arg Val Glu Leu Pro 3700 3705 3710 Val His Ala Phe Arg His ArgArg Tyr Trp Leu Ala Pro Gly Arg Ala 3715 3720 3725 Ala Asp Thr Asp AspTrp Met Tyr Arg Ile Gly Trp Asp Arg Leu Pro 3730 3735 3740 Ala Val ThrGly Gly Ala Arg Thr Ala Gly Arg Trp Leu Val Ile His 3745 3750 3755 3760Pro Asp Ser Pro Arg Cys Arg Glu Leu Ser Gly His Ala Glu Arg Ala 37653770 3775 Leu Arg Ala Ala Gly Ala Ser Pro Val Pro Leu Pro Val Asp AlaPro 3780 3785 3790 Ala Ala Asp Arg Ala Ser Phe Ala Ala Leu Leu Arg SerAla Thr Gly 3795 3800 3805 Pro Asp Thr Arg Gly Asp Thr Ala Ala Pro ValAla Gly Val Leu Ser 3810 3815 3820 Leu Leu Ser Glu Glu Asp Arg Pro HisArg Gln His Ala Pro Val Pro 3825 3830 3835 3840 Ala Gly Val Leu Ala ThrLeu Ser Leu Met Gln Ala Met Glu Glu Glu 3845 3850 3855 Ala Val Glu AlaArg Val Trp Cys Val Ser Arg Ala Ala Val Ala Ala 3860 3865 3870 Ala AspArg Glu Arg Pro Val Gly Ala Gly Ala Ala Leu Trp Gly Leu 3875 3880 3885Gly Arg Val Ala Ala Leu Glu Arg Pro Thr Arg Trp Gly Gly Leu Val 38903895 3900 Asp Leu Pro Ala Ser Pro Gly Ala Ala His Trp Ala Ala Ala ValGlu 3905 3910 3915 3920 Arg Leu Ala Gly Pro Glu Asp Gln Ile Ala Val ArgAla Ser Gly Ser 3925 3930 3935 Trp Gly Arg Arg Leu Thr Arg Leu Pro ArgAsp Gly Gly Gly Arg Thr 3940 3945 3950 Ala Ala 
Pro Ala Tyr Arg Pro ArgGly Thr Val Leu Val Thr Gly Gly 3955 3960 3965 Thr Gly Ala Leu Gly GlyHis Leu Ala Arg Trp Leu Ala Ala Ala Gly 3970 3975 3980 Ala Glu His LeuAla Leu Thr Ser Arg Arg Gly Pro Asp Ala Pro Gly 3985 3990 3995 4000 AlaAla Gly Leu Glu Ala Glu Leu Leu Leu Leu Gly Ala Lys Val Thr 4005 40104015 Phe Ala Ala Cys Asp Thr Ala Asp Arg Asp Gly Leu Ala Arg Val Leu4020 4025 4030 Arg Ala Ile Pro Glu Asp Thr Pro Leu Thr Ala Val Phe HisAla Ala 4035 4040 4045 Gly Val Pro Gln Val Thr Pro Leu Ser Arg Thr SerPro Glu His Phe 4050 4055 4060 Ala Asp Val Tyr Ala Gly Lys Ala Ala GlyAla Ala His Leu Asp Glu 4065 4070 4075 4080 Leu Thr Arg Glu Leu Gly AlaGly Leu Asp Ala Phe Val Leu Tyr Ser 4085 4090 4095 Ser Gly Ala Gly ValTrp Gly Ser Ala Gly Gln Gly Ala Tyr Ala Ala 4100 4105 4110 Ala Asn AlaAla Leu Asp Ala Leu Ala Arg Arg Arg Ala Ala Asp Gly 4115 4120 4125 LeuPro Ala Thr Ser Ile Ala Trp Gly Val Trp Gly Gly Gly Gly Met 4130 41354140 Gly Ala Asp Glu Ala Gly Ala Glu Tyr Leu Gly Arg Arg Gly Met Arg4145 4150 4155 4160 Pro Met Ala Pro Val Ser Ala Leu Arg Ala Met Ala ThrAla Ile Ala 4165 4170 4175 Ser Gly Glu Pro Cys Pro Thr Val Thr His ThrAsp Trp Glu Arg Phe 4180 4185 4190 Gly Glu Gly Phe Thr Ala Phe Arg ProSer Pro Leu Ile Ala Gly Leu 4195 4200 4205 Gly Thr Pro Gly Gly Gly ArgAla Ala Glu Thr Pro Glu Glu Gly Asn 4210 4215 4220 Ala Thr Ala Ala AlaAsp Leu Thr Ala Leu Pro Pro Ala Glu Leu Arg 4225 4230 4235 4240 Thr AlaLeu Arg Glu Leu Val Arg Ala Arg Thr Ala Ala Ala Leu Gly 4245 4250 4255Leu Asp Asp Pro Ala Glu Val Ala Glu Gly Glu Arg Phe Pro Ala Met 42604265 4270 Gly Phe Asp Ser Leu Ala Thr Val Arg Leu Arg Arg Gly Leu AlaSer 4275 4280 4285 Ala Thr Gly Leu Asp Leu Pro Pro Asp Leu Leu Phe AspArg Asp Thr 4290 4295 4300 Pro Ala Ala Leu Ala Ala His Leu Ala Glu LeuLeu Ala Thr Ala Arg 4305 4310 4315 4320 Asp His Gly Pro Gly Gly Pro GlyThr Gly Ala Ala Pro Ala Asp Ala 4325 4330 4335 Gly Ser Gly Leu Pro AlaLeu Tyr Arg Glu Ala Val Arg Thr Gly Arg 4340 4345 4350 Ala Ala Glu MetAla Glu Leu Leu Ala 
Ala Ala Ser Arg Phe Arg Pro 4355 4360 4365 Ala PheGly Thr Ala Asp Arg Gln Pro Val Ala Leu Val Pro Leu Ala 4370 4375 4380Asp Gly Ala Glu Asp Thr Gly Leu Pro Leu Leu Val Gly Cys Ala Gly 43854390 4395 4400 Thr Ala Val Ala Ser Gly Pro Val Glu Phe Thr Ala Phe AlaGly Ala 4405 4410 4415 Leu Ala Asp Leu Pro Ala Ala Ala Pro Met Ala AlaLeu Pro Gln Pro 4420 4425 4430 Gly Phe Leu Pro Gly Glu Arg Val Pro AlaThr Pro Glu Ala Leu Phe 4435 4440 4445 Glu Ala Gln Ala Glu Ala Leu LeuArg Tyr Ala Ala Gly Arg Pro Phe 4450 4455 4460 Val Leu Leu Gly His SerAla Gly Ala Asn Met Ala His Ala Leu Thr 4465 4470 4475 4480 Arg His LeuGlu Ala Asn Gly Gly Gly Pro Ala Gly Leu Val Leu Met 4485 4490 4495 AspIle Tyr Thr Pro Ala Asp Pro Gly Ala Met Gly Val Trp Arg Asn 4500 45054510 Asp Met Phe Gln Trp Val Trp Arg Arg Ser Asp Ile Pro Pro Asp Asp4515 4520 4525 His Arg Leu Thr Ala Met Gly Ala Tyr His Arg Leu Leu LeuAsp Trp 4530 4535 4540 Ser Pro Thr Pro Val Arg Ala Pro Val Leu His LeuArg Ala Ala Glu 4545 4550 4555 4560 Pro Met Gly Asp Trp Pro Pro Gly AspThr Gly Trp Gln Ser His Trp 4565 4570 4575 Asp Gly Ala His Thr Thr AlaGly Ile Pro Gly Asn His Phe Thr Met 4580 4585 4590 Met Thr Glu His AlaSer Ala Ala Ala Arg Leu Val His Gly Trp Leu 4595 4600 4605 Ala Glu ArgThr Pro Ser Gly Gln Gly Gly Ser Pro Ser Arg Ala Ala 4610 4615 4620 GlyArg Glu Glu Arg Pro Met Ile Leu Arg Ala Gly Thr Ala Asp Pro 4625 46304635 4640 Ala Pro Tyr Glu Glu Glu Ile Pro Gly Tyr Arg Ala Arg Ile LeuAsn 4645 4650 4655 Met Ser Asn Lys Asn Asn Asp Glu Leu Gln Arg Gln AlaSer Glu Asn 4660 4665 4670 Thr Leu Gly Leu Asn Pro Val Ile Gly Ile ArgArg Lys Asp Leu Leu 4675 4680 4685 Ser Ser Ala Arg Thr Val Leu Arg GlnAla Val Arg Gln Pro Leu His 4690 4695 4700 Ser Ala Lys His Val Ala HisPhe Gly Leu Glu Leu Lys Asn Val Leu 4705 4710 4715 4720 Leu Gly Lys SerSer Leu Ala Pro Glu Ser Asp Asp Arg Arg Phe Asn 4725 4730 4735 Asp ProAla Trp Ser Asn Asn Pro Leu Tyr Arg Arg Tyr Leu Gln Thr 4740 4745 4750Tyr Leu Ala Trp Arg Lys Glu Leu Gln Asp Trp Ile Gly Asn Ser Asp 
47554760 4765 Leu Ser Pro Gln Asp Ile Ser Arg Gly Gln Phe Val Ile Asn LeuMet 4770 4775 4780 Thr Glu Ala Met Ala Pro Thr Asn Thr Leu Ser Asn ProAla Ala Val 4785 4790 4795 4800 Lys Arg Phe Phe Glu Thr Gly Gly Lys SerLeu Leu Asp Gly Leu Ser 4805 4810 4815 Asn Leu Ala Lys Asp Leu Val AsnAsn Gly Gly Met Pro Ser Gln Val 4820 4825 4830 Asn Met Asp Ala Phe GluVal Gly Lys Asn Leu Gly Thr Ser Glu Gly 4835 4840 4845 Ala Val Val TyrArg Asn Asp Val Leu Glu Leu Ile Gln Tyr Lys Pro 4850 4855 4860 Ile ThrGlu Gln Val His Ala Arg Pro Leu Leu Val Val Pro Pro Gln 4865 4870 48754880 Ile Asn Lys Phe Tyr Val Phe Asp Leu Ser Pro Glu Lys Ser Leu Ala4885 4890 4895 Arg Tyr Cys Leu Arg Ser Gln Gln Gln Thr Phe Ile Ile SerTrp Arg 4900 4905 4910 Asn Pro Thr Lys Ala Gln Arg Glu Trp Gly Leu SerThr Tyr Ile Asp 4915 4920 4925 Ala Leu Lys Glu Ala Val Asp Ala Val LeuAla Ile Thr Gly Ser Lys 4930 4935 4940 Asp Leu Asn Met Leu Gly Ala CysSer Gly Gly Ile Thr Cys Thr Ala 4945 4950 4955 4960 Leu Val Gly His TyrAla Ala Leu Gly Glu Asn Lys Val Asn Ala Leu 4965 4970 4975 Thr Leu LeuVal Ser Val Leu Asp Thr Thr Met Asp Asn Gln Val Ala 4980 4985 4990 LeuPhe Val Asp Glu Gln Thr Leu Glu Ala Ala Lys Arg His Ser Tyr 4995 50005005 Gln Ala Gly Val Leu Glu Gly Ser Glu Met Ala Lys Val Phe Ala Trp5010 5015 5020 Met Arg Pro Asn Asp Leu Ile Trp Asn Tyr Trp Val Asn AsnTyr Leu 5025 5030 5035 5040 Leu Gly Asn Glu Pro Pro Val Phe Asp Ile LeuPhe Trp Asn Asn Asp 5045 5050 5055 Thr Thr Arg Leu Pro Ala Ala Phe HisGly Asp Leu Ile Glu Met Phe 5060 5065 5070 Lys Ser Asn Pro Leu Thr ArgPro Asp Ala Leu Glu Val Cys Gly Thr 5075 5080 5085 Pro Ile Asp Leu LysGln Val Lys Cys Asp Ile Tyr Ser Leu Ala Gly 5090 5095 5100 Thr Asn AspHis Ile Thr Pro Trp Gln Ser Cys Tyr Arg Ser Ala His 5105 5110 5115 5120Leu Phe Gly Gly Lys Ile Glu Phe Val Leu Ser Asn Ser Gly His Ile 51255130 5135 Gln Ser Ile Leu Asn Pro Pro Gly Asn Pro Lys Ala Arg Phe MetThr 5140 5145 5150 Gly Ala Asp Arg Pro Gly Asp Pro Val Ala Trp Gln GluAsn Ala Thr 5155 5160 5165 Lys His Ala 
Asp Ser Trp Trp Leu His Trp GlnSer Trp Leu Gly Glu 5170 5175 5180 Arg Ala Gly Glu Leu Glu Lys Ala ProThr Arg Leu Gly Asn Arg Ala 5185 5190 5195 5200 Tyr Ala Ala Gly Glu AlaSer Pro Gly Thr Tyr Val His Glu Arg 5205 5210 5215 3 13613 DNAStreptomyces venezuelae 3 ggatccggcg cttccacccc gcgccgaaca gcgcggtgcggctggtctgc ctgccgcacg 60 ccggcggctc cgccagctac ttcttccgct tctcggaggagctgcacccc tccgtcgagg 120 ccctgtcggt gcagtatccg ggccgccagg accggcgtgccgagccgtgt ctggagagcg 180 tcgaggagct cgccgagcat gtggtcgcgg ccaccgaaccctggtggcag gagggccggc 240 tggccttctt cgggcacagc ctcggcgcct ccgtcgccttcgagacggcc cgcatcctgg 300 aacagcggca cggggtacgg cccgagggcc tgtacgtctccggtcggcgc gccccgtcgc 360 tggcgccgga ccggctcgtc caccagctgg acgaccgggcgttcctggcc gagatccggc 420 ggctcagcgg caccgacgag cggttcctcc aggacgacgagctgctgcgg ctggtgctgc 480 ccgcgctgcg cagcgactac aaggcggcgg agacgtacctgcaccggccg tccgccaagc 540 tcacctgccc ggtgatggcc ctggccggcg accgtgacccgaaggcgccg ctgaacgagg 600 tggccgagtg gcgtcggcac accagcgggc cgttctgcctccgggcgtac tccggcggcc 660 acttctacct caacgaccag tggcacgaga tctgcaacgacatctccgac cacctgctcg 720 tcacccgcgg cgcgcccgat gcccgcgtcg tgcagcccccgaccagcctt atcgaaggag 780 cggcgaagag atggcagaac ccacggtgac cgacgacctgacgggggccc tcacgcagcc 840 cccgctgggc cgcaccgtcc gcgcggtggc cgaccgtgaactcggcaccc acctcctgga 900 gacccgcggc atccactgga tccacgccgc gaacggcgacccgtacgcca ccgtgctgcg 960 cggccaggcg gacgacccgt atcccgcgta cgagcgggtgcgtgcccgcg gcgcgctctc 1020 cttcagcccg acgggcagct gggtcaccgc cgatcacgccctggcggcga gcatcctctg 1080 ctcgacggac ttcggggtct ccggcgccga cggcgtcccggtgccgcagc aggtcctctc 1140 gtacggggag ggctgtccgc tggagcgcga gcaggtgctgccggcggccg gtgacgtgcc 1200 ggagggcggg cagcgtgccg tggtcgaggg gatccaccgggagacgctgg agggtctcgc 1260 gccggacccg tcggcgtcgt acgccttcga gctgctgggcggtttcgtcc gcccggcggt 1320 gacggccgct gccgccgccg tgctgggtgt tcccgcggaccggcgcgcgg acttcgcgga 1380 tctgctggag cggctccggc cgctgtccga cagcctgctggccccgcagt ccctgcggac 1440 ggtacgggcg gcggacggcg cgctggccga gctcacggcgctgctcgccg attcggacga 1500 
ctcccccggg gccctgctgt cggcgctcgg ggtcaccgcagccgtccagc tcaccgggaa 1560 cgcggtgctc gcgctcctcg cgcatcccga gcagtggcgggagctgtgcg accggcccgg 1620 gctcgcggcg gccgcggtgg aggagaccct ccgctacgacccgccggtgc agctcgacgc 1680 ccgggtggtc cgcggggaga cggagctggc gggccggcggctgccggccg gggcgcatgt 1740 cgtcgtcctg accgccgcga ccggccggga cccggaggtcttcacggacc cggagcgctt 1800 cgacctcgcg cgccccgacg ccgccgcgca cctcgcgctgcaccccgccg gtccgtacgg 1860 cccggtggcg tccctggtcc ggcttcaggc ggaggtcgcgctgcggaccc tggccgggcg 1920 tttccccggg ctgcggcagg cgggggacgt gctccgcccccgccgcgcgc ctgtcggccg 1980 cgggccgctg agcgtcccgg tcagcagctc ctgagacaccggggccccgg tccgcccggc 2040 cccccttcgg acggaccgga cggctcggac cacggggacggctcagaccg tcccgtgtgt 2100 ccccgtccgg ctcccgtccg ccccatcccg cccctccaccggcaaggaag gacacgacgc 2160 catgcgcgtc ctgctgacct cgttcgcaca tcacacgcactactacggcc tggtgcccct 2220 ggcctgggcg ctgctcgccg ccgggcacga ggtgcgggtcgccagccagc ccgcgctcac 2280 ggacaccatc accgggtccg ggctcgccgc ggtgccggtcggcaccgacc acctcatcca 2340 cgagtaccgg gtgcggatgg cgggcgagcc gcgcccgaaccatccggcga tcgccttcga 2400 cgaggcccgt cccgagccgc tggactggga ccacgccctcggcatcgagg cgatcctcgc 2460 cccgtacttc catctgctcg ccaacaacga ctcgatggtcgacgacctcg tcgacttcgc 2520 ccggtcctgg cagccggacc tggtgctgtg ggagccgacgacctacgcgg gcgccgtcgc 2580 cgcccaggtc accggtgccg cgcacgcccg ggtcctgtgggggcccgacg tgatgggcag 2640 cgcccgccgc aagttcgtcg cgctgcggga ccggcagccgcccgagcacc gcgaggaccc 2700 caccgcggag tggctgacgt ggacgctcga ccggtacggcgcctccttcg aagaggagct 2760 gctcaccggc cagttcacga tcgacccgac cccgccgagcctgcgcctcg acacgggcct 2820 gccgaccgtc gggatgcgtt atgttccgta caacggcacgtcggtcgtgc cggactggct 2880 gagtgagccg cccgcgcggc cccgggtctg cctgaccctcggcgtctccg cgcgtgaggt 2940 cctcggcggc gacggcgtct cgcagggcga catcctggaggcgctcgccg acctcgacat 3000 cgagctcgtc gccacgctcg acgcgagtca gcgcgccgagatccgcaact acccgaagca 3060 cacccggttc acggacttcg tgccgatgca cgcgctcctgccgagctgct cggcgatcat 3120 ccaccacggc ggggcgggca cctacgcgac cgccgtgatcaacgcggtgc cgcaggtcat 3180 gctcgccgag ctgtgggacg cgccggtcaa 
ggcgcgggccgtcgccgagc agggggcggg 3240 gttcttcctg ccgccggccg agctcacgcc gcaggccgtgcgggacgccg tcgtccgcat 3300 cctcgacgac ccctcggtcg ccaccgccgc gcaccggctgcgcgaggaga ccttcggcga 3360 ccccaccccg gccgggatcg tccccgagct ggagcggctcgccgcgcagc accgccgccc 3420 gccggccgac gcccggcact gagccgcacc cctcgccccaggcctcaccc ctgtatctgc 3480 gccgggggac gcccccggcc caccctccga aagaccgaaagcaggagcac cgtgtacgaa 3540 gtcgaccacg ccgacgtcta cgacctcttc tacctgggtcgcggcaagga ctacgccgcc 3600 gaggcctccg acatcgccga cctggtgcgc tcccgtacccccgaggcctc ctcgctcctg 3660 gacgtggcct gcggtacggg cacgcatctg gagcacttcaccaaggagtt cggcgacacc 3720 gccggcctgg agctgtccga ggacatgctc acccacgcccgcaagcggct gcccgacgcc 3780 acgctccacc agggcgacat gcgggacttc cggctcggccggaagttctc cgccgtggtc 3840 agcatgttca gctccgtcgg ctacctgaag acgaccgaggaactcggcgc ggccgtcgcc 3900 tcgttcgcgg agcacctgga gcccggtggc gtcgtcgtcgtcgagccgtg gtggttcccg 3960 gagaccttcg ccgacggctg ggtcagcgcc gacgtcgtccgccgtgacgg gcgcaccgtg 4020 gcccgtgtct cgcactcggt gcgggagggg aacgcgacgcgcatggaggt ccacttcacc 4080 gtggccgacc cgggcaaggg cgtgcggcac ttctccgacgtccatctcat caccctgttc 4140 caccaggccg agtacgaggc cgcgttcacg gccgccgggctgcgcgtcga gtacctggag 4200 ggcggcccgt cgggccgtgg cctcttcgtc ggcgtccccgcctgagcacc gcccaagacc 4260 ccccggggcg ggacgtcccg ggtgcaccaa gcaaagagagagaaacgaac cgtgacaggt 4320 aagacccgaa taccgcgtgt ccgccgcggc cgcaccacgcccagggcctt caccctggcc 4380 gtcgtcggca ccctgctggc gggcaccacc gtggcggccgccgctcccgg cgccgccgac 4440 acggccaatg ttcagtacac gagccgggcg gcggagctcgtcgcccagat gacgctcgac 4500 gagaagatca gcttcgtcca ctgggcgctg gaccccgaccggcagaacgt cggctacctt 4560 cccggcgtgc cgcgtctggg catcccggag ctgcgtgccgccgacggccc gaacggcatc 4620 cgcctggtgg ggcagaccgc caccgcgctg cccgcgccggtcgccctggc cagcaccttc 4680 gacgacacca tggccgacag ctacggcaag gtcatgggccgcgacggtcg cgcgctcaac 4740 caggacatgg tcctgggccc gatgatgaac aacatccgggtgccgcacgg cggccggaac 4800 tacgagacct tcagcgagga ccccctggtc tcctcgcgcaccgcggtcgc ccagatcaag 4860 ggcatccagg gtgcgggtct gatgaccacg gccaagcacttcgcggccaa caaccaggag 4920 
aacaaccgct tctccgtgaa cgccaatgtc gacgagcagacgctccgcga gatcgagttc 4980 ccggcgttcg aggcgtcctc caaggccggc gcggcctccttcatgtgtgc ctacaacggc 5040 ctcaacggga agccgtcctg cggcaacgac gagctcctcaacaacgtgct gcgcacgcag 5100 tggggcttcc agggctgggt gatgtccgac tggctcgccaccccgggcac cgacgccatc 5160 accaagggcc tcgaccagga gatgggcgtc gagctccccggcgacgtccc gaagggcgag 5220 ccctcgccgc cggccaagtt cttcggcgag gcgctgaagacggccgtcct gaacggcacg 5280 gtccccgagg cggccgtgac gcggtcggcg gagcggatcgtcggccagat ggagaagttc 5340 ggtctgctcc tcgccactcc ggcgccgcgg cccgagcgcgacaaggcggg tgcccaggcg 5400 gtgtcccgca aggtcgccga gaacggcgcg gtgctcctgcgcaacgaggg ccaggccctg 5460 ccgctcgccg gtgacgccgg caagagcatc gcggtcatcggcccgacggc cgtcgacccc 5520 aaggtcaccg gcctgggcag cgcccacgtc gtcccggactcggcggcggc gccactcgac 5580 accatcaagg cccgcgcggg tgcgggtgcg acggtgacgtacgagacggg tgaggagacc 5640 ttcgggacgc agatcccggc ggggaacctc agcccggcgttcaaccaggg ccaccagctc 5700 gagccgggca aggcgggggc gctgtacgac ggcacgctgaccgtgcccgc cgacggcgag 5760 taccgcatcg cggtccgtgc caccggtggt tacgccacggtgcagctcgg cagccacacc 5820 atcgaggccg gtcaggtcta cggcaaggtg agcagcccgctcctcaagct gaccaagggc 5880 acgcacaagc tcacgatctc gggcttcgcg atgagtgccaccccgctctc cctggagctg 5940 ggctgggtga cgccggcggc ggccgacgcg acgatcgcgaaggccgtgga gtcggcgcgg 6000 aaggcccgta cggcggtcgt cttcgcctac gacgacggcaccgagggcgt cgaccgtccg 6060 aacctgtcgc tgccgggtac gcaggacaag ctgatctcggctgtcgcgga cgccaacccg 6120 aacacgatcg tggtcctcaa caccggttcg tcggtgctgatgccgtggct gtccaagacc 6180 cgcgcggtcc tggacatgtg gtacccgggc caggcgggcgccgaggccac cgccgcgctg 6240 ctctacggtg acgtcaaccc gagcggcaag ctcacgcagagcttcccggc cgccgagaac 6300 cagcacgcgg tcgccggcga cccgacaagc tacccgggcgtcgacaacca gcagacgtac 6360 cgcgagggca tccacgtcgg gtaccgctgg ttcgacaaggagaacgtcaa gccgctgttc 6420 ccgttcgggc acggcctgtc gtacacctcg ttcacgcagagcgccccgac cgtcgtgcgt 6480 acgtccacgg gtggtctgaa ggtcacggtc acggtccgcaacagcgggaa gcgcgccggc 6540 caggaggtcg tccaggcgta cctcggtgcc agcccgaacgtgacggctcc gcaggcgaag 6600 aagaagctcg tgggctacac gaaggtctcg 
ctcgccgcgggcgaggcgaa gacggtgacg 6660 gtgaacgtcg accgccgtca gctgcagacc ggttcgtcctccgccgacct gcggggcagc 6720 gccacggtca acgtctggtg acgtgacgcc gtgaaagcggcggtgcccgc cacccgggag 6780 ggtggcgggc accgcttttt cggcctgctg ggtctaccggaccacctgac taggcctggt 6840 cgacccgctc ggcccattcg cgcacggcgt cgatcacccgcagcgcctgc gggcgctcca 6900 ggtgcgggcc gatcggcagg ctgaggacct gccgcgcgaagctctcggcc cgcgggagcg 6960 agccttccgg cggtgcctcg cccgcgtagg cgggcgagaggtgcacgggt accgggtagt 7020 gcgtgagggt gtcgatgccg cgggcgtcga ggtggctgcgcagctcgtcg cggcgctcgg 7080 tgcgcacggt gaagaggtgc cagaccgggt cggtgtcgggcgcggtcacc ggcaggccga 7140 tgccgggcag tccggcgagc ccggagaggt actccgcggccagcgccgac ctgcggccgt 7200 tccagctgtc caggtgggcg agccggatcc gcagcacggcggcctgcatc tcgtccaggc 7260 gggagttggt gcccttcgtc tcgtggctgt acttctgccgcgagccgtag ttgcggagca 7320 tccggagccg ttcggcgagc tcggggtcgc cggtgacgacggcgccgccg tcgccgaagc 7380 agccgaggtt cttgcccggg tagaagctga acgcggccaccgacgacccg gcgccgatcc 7440 gccggccccg gtagcgggcg ccgtgggcct gcgcggcgtcctcgacgatg tgcaggccgt 7500 gccggtccgc gagctcgcgg agggcgtcca tgtcggcggggtgcccgtag aggtggacgg 7560 ggaggagcgc ccgggtgcgg ggggtgatcg ccttctcgacgagcagcggg tccagggtgg 7620 ggtggtcctc gtgcggctcg acgggcacgg gggtcgcgccggtggcggac accgcgagcc 7680 agctggcgat gtacgtgtgc gaggggacga tcacctcgtccccgggtccg atgccgaggc 7740 cgcggagggc gagctggagg gcgtccatcc cgctgttcacgccgacggcg tggtccgtct 7800 cgcagtacgc ggcgaactcc gcctcgaatc cttcgagttcgggtccgagg aggtagcgcc 7860 ccgagtcgag gacgcgggcg atcgcggcgt cggtctccgcgcggagctcc tcgtaggcgg 7920 ccttgaggtc gaggaagggg acgcgggggg tctcggcgcggctgctcacg cggacacctc 7980 cacggcggtg gcgggcagct gcggggcggt cgccttgagcggctcccacc agccgcggtt 8040 ctcccggtac cagcggacgg tccgcgcgag gccgtccgcgaaggagacct gcgggcggta 8100 gccgagctcg cgctcgatct cgccgccgtc gagggagtagcgcaggtcgt ggcccttgcg 8160 gtcggcgacc ttccggaccg aggaccagtc ggcgccgagcgagtccagga ggatgccggt 8220 gagttcgcgg ttggtcagct ccaggccgcc gccgatgtggtagatctcgc cggcccggcc 8280 gcccgcgagg acgagcgcga tgccccggca gtggtcgtcggtgtgcaccc actcgcggac 8340 
gttcgcgccg tcgccgtaca gcgggagcgt cccgccgtcgaggaggttcg tcacgaagag 8400 ggggatgagc ttctcggggt gctggtacgg cccgtagttgttgcagcagc gggtgatccg 8460 tacgtcgagg ccgtacgtcc ggtggtaggc gcgggcaacgaggtcggagc cggccttgga 8520 cgccgcgtag ggcgagttgg gctccagcgg gctgctctcggtccaggagc cggagtcgat 8580 cgacccgtac acctcgtcgg tggagacgtg cacgacccggccgacgccgg cgtcgacggc 8640 gcactggagc agcgtctgcg tgccctgcac gttggtctcggtgaacacgg acgcgcccgc 8700 gatggagcgg tccacgtggc tctcggccgc gaagtggacgatggcgtcca cgccgcgcag 8760 ttcccgggcg aggaggccgg cgtcgcggat gtcgccgtggacgaagcgca gtcgcgggtc 8820 cgcgtccacc ggggcgaggt tggcgcggtt gcccgcgtaggtgaggctgt ccaggacgat 8880 cacctcatcg gcgggcacgt cggggtacgc cccggcgaggagctgccgca cgaagtgcga 8940 gccgatgaag cccgcacctc cggtcaccag aagccgcactgccgtcttcc tttcggtcgc 9000 gctgtaggtc gcggtgtggg tcgcactgtc ggtggcggtgcgggtcgcgg tgtgggtcgc 9060 actgtcggtg gcgctgtcgg tcgtgggaac gcgtcggccgcgaggtgccc tcacggggct 9120 ccctcgcggc cggcgatctc catcagatag ctgccgtactcggtgcggga gaggccttct 9180 cccaggccgt gacaggcctc ggcgtcgatg aagcccatgcggaaggcgat ctcctcaagg 9240 cccgcgatcc agacgccctg ccgctcctcc aggacctggacgtactgggc ggcccgcagg 9300 agcgagtcgt gggtgccggt gtccagccag gcgaagccgcggcccaggtt gacgagttcg 9360 gcccggcccc gctccaggta gacgcggttg acgtcggtgatctccagctc gccgcgcggc 9420 gagggccgga tgttcttggc gatgtcgacg acgtcgttgtcgtagaggta gaggccggtg 9480 acggcgaggt tggagcgcgg cttgacgggc ttctcgacgaggtcggtcag ccggcccgtc 9540 gcgtccacct cggcgacgcc gtaccgctcg gggtccttgaccgggtagcc gaagagcacg 9600 cagccgtcga ggcgcgcgat gctgtcccgc aggagcgtgtagaggccggg cccgtggaag 9660 atgttgtcgc ccaggatcag ggcgcaggtg tcgtcgccgatgtgctcggc tccgacgaga 9720 agtgcgtccg cgattcctgc gggctctttc tggaccgcatagtcgagttc tattcccagg 9780 tgcctgccgt ttccgagaag cgactggaag agttcgatgtgctggggggt cgagatgatt 9840 tgaatctcgc gaataccgcc gagcatgaga accgacagcggatagtagat catcggtttg 9900 ttgtagaccg gaagaatctg cttcgaaatg accgaggtcgccggatgcag ccgagttccg 9960 ctcccgccgg ccaggactat tcccttcatt ctcggaaactagcagcaggg cgccggtgat 10020 aacggtcggc gtggcgagtt aggggggcgc 
taggggctgcgcagggggag tgtcaccacc 10080 cctttggggg gtgggaaaac accgagggcc cggccggacggccgggccct caggtggggg 10140 gatcgtgggg gggggatcgg ggggatcggg gcgggtgcgggtcagcgcag gaagccgcgg 10200 gcctcctccc agccgtccgc ggcgtcgcgc tccagctggttcaggcgggc ggtgacgacc 10260 tgatcgaagc cgtccatgaa gtactcgtcg ccgtcgacggccgccacctc gccgccgcgc 10320 tcgacgaagt ccctgacgac ctcggtgagg gaggtgtcgggggtcacgcg gcccgcgatg 10380 tagcgggtcg cgccgtccag gtcggggaag ccggcctcgcggtacaggta cacgtcgccg 10440 aggagatcga cctgcaccgc gacctgcggg tgcgcggtgggccgcatggt ggcgggcttg 10500 atccgcagca gttcggcgtc ggccccggtg cgcaggctgttcagggcgta gccgtagtcg 10560 atgtggagtc cgggggtgcg ctcgcggacc cgctcctcgaaggcgttgag ggcctcctgg 10620 agctcggccc gctcctcctg cggcagcttg ccgtcgtcacggccgctgta gtcctcgcga 10680 atgttgacga agtcgatcgt cctgccctgc ccggcgtcgttgaggtcggc gatgaagtcg 10740 accaggtcga gcaggcggga ggcacggccc gggagcacgatgtaggcgaa gccgaggttg 10800 atcggcgact cgcgctcggc gcgcagctgc tggaagcggcgcaggttctc gcggacgcgg 10860 cggaaggcgg ccttcttgcc ggtggtctgc tcgtactcctcgtcgttgag gccgtagagc 10920 gaggtgcgga tggcgtgcag gccccagagg ccgggctggcgctccagggt gcgctcggtg 10980 agcgcgaagg agttcgtgta gacggtgggc cgcaggccgtggtcggtggc gtgcgcggcc 11040 aggctcccga ggccggggtt ggtgagcggc tccaggccgccggagaagta catcgccgag 11100 gggttgcccg cgggtatctc gtcgatgacc gaccggaacatggcgttgcc ggcgtcgagg 11160 gcggacgggt cgtagcgggc gccggtcaca cggacgcagaagtggcagcg gaacatgcag 11220 gtcgggccgg ggtagaggcc gacgctgtac gggaagacgggcttcctggc gagcgccgcg 11280 tcgaagacgc cgcgctgttc gagcgggagc agggtgttcttccagtacgc cccggcgggg 11340 ccggtctcga ccgcggtgcg gagctccggg acctgcccgaacagggcgag gaggcgccgg 11400 aaggcgtccc ggtcgacgcc caggtcgtgg cgggcctcctccagcggggt gaaggggctg 11460 ttgccgtagc gcacggcgag ccggacgagg tggcgggcggtcgttccggc ctcgtcgggc 11520 ggcacgaggc cgccggcggc gagggtctgg ccgacggcgtggaccgccgc ccccagatcg 11580 gctccggggt gcgcgcagcg ttcggccggg gcggtggcggaaagggcggg ggcggtcatc 11640 gggagcgtcc aatcgtgggc gtggatgtct ggggggccgcgagcggggcg ggggccgtgt 11700 cgcggtggcg cgcggtcagt tcgcggccgc 
gggtcgcgcagagacgcagc aggtcggcga 11760 cccggcggat gtcgtcgtcg ccgatggcgg tgccggtcggcagggacagc acgcgcgcgg 11820 cgaggcgttc ggtgtgcggc agcggggcgt gcggctgcccgcggtacggc tccagctcgt 11880 ggcagcccgg cgagaagtag gcgcgggtgt gcacgccttcggccttcagg acctccatga 11940 cgaggtcgcg gtggatgccg gtggtggcct cgtcgatctcgacgatcacg tactggtggt 12000 tgttgaggcc gtggcggtcg tggtcggcga cgaggacgccggggaggtcc gcgaggtgct 12060 cgcggtaggc ggcgtggttg cgccggttcc ggtcgatgacctcgggaaac gcgtcgaggg 12120 aggtgaggcc catggcggcg gcggcctcgc tcatcttggcgttggtcccg ccggcggggc 12180 tgccgccggg caggtcgaag ccgaagttgt ggagggcgcggatccgggcg gcgaggtcgg 12240 cgtcgtcggt gacgacggcg ccgccctcga aggcgttgacggccttggtg gcgtggaagc 12300 tgaagacctc ggcgtcgccg aggctgccgg cgggccggccgtcgaccgcg cagccgaggg 12360 cgtgcgcggc gtcgaagtac agccgcaggc cgtgctcgtcggcgaccttc cgcagctggt 12420 cggcggcgca ggggcggccc cagaggtgga cgccgacgacggccgaggtg cggggtgtga 12480 ccgcggcggc cacctggtcc gggtcgaggt tgccggtgtccgggtcgatg tcggcgaaga 12540 ccggggtgag gccgatccag cgcagtgcgt gcggggtggcggcgaacgtc atcgacggca 12600 tgatcacttc gccggtgagg ccggcggcgt gcgcgaggagctggagcccg gccgtggcgt 12660 tgcaggtggc cacggcatgc cggaccccgg cgagcccggcgacgcgctcc tcgaactcgc 12720 ggacgagcgg gccgccgttg gacagccact ggctgtcgagggcccggtcg agccgctcgt 12780 acagcctggc gcggtcgatg cggttgggcc gccccacgaggagcggctgg tcgaaagcgg 12840 cggggccgcc gaagaatgcg aggtcggata aggcgcttttcacggatgtt ccctccgggc 12900 caccgtcacg aaatgattcg ccgatccggg aatcccgaacgaggtcgccg cgctccaccg 12960 tgacgtacga cgagatggtc gattgtggtg gtcgatttcggggggactct aatccgcgcg 13020 gaacgggacc gacaagagca cgctatgcgc tctcgatgtgcttcggatca catccgcctc 13080 cggggtattc catcggcggc ccgaatgtga tgatccttgacaggatccgg gaatcagccg 13140 agccgccggg agggccgggg cgcgctccgc ggaagagtacgtgtgagaag tcccgttcct 13200 cttcccgttt ccgttccgct tccggcccgg tctggagttctccgtgcgcc gtacccagca 13260 gggaacgacc gcttctcccc cggtactcga cctcggggccctggggcagg atttcgcggc 13320 cgatccgtat ccgacgtacg cgagactgcg tgccgagggtccggcccacc gggtgcgcac 13380 ccccgagggg gacgaggtgt ggctggtcgt 
cggctacgaccgggcgcggg cggtcctcgc 13440 cgatccccgg ttcagcaaga ctggcgcaac tccacgactcccctgaccga agccgaagcc 13500 gcgctcaacc acaacatgct gagttccgaa cccgccgcggcacacccggc tgcgccagct 13560 ggtggcccgt gagttcacca tgcgccggtg cgagttgctgccgccccggg tcc 13613 4 3782 PRT Streptomyces venezuelae 4 Met Thr AspAsp Leu Thr Gly Ala Leu Thr Gln Pro Pro Leu Gly Arg 1 5 10 15 Thr ValArg Ala Val Ala Asp Arg Glu Leu Gly Thr His Leu Leu Glu 20 25 30 Thr ArgGly Ile His Trp Ile His Ala Ala Asn Gly Asp Pro Tyr Ala 35 40 45 Thr ValLeu Arg Gly Gln Ala Asp Asp Pro Tyr Pro Ala Tyr Glu Arg 50 55 60 Val ArgAla Arg Gly Ala Leu Ser Phe Ser Pro Thr Gly Ser Trp Val 65 70 75 80 ThrAla Asp His Ala Leu Ala Ala Ser Ile Leu Cys Ser Thr Asp Phe 85 90 95 GlyVal Ser Gly Ala Asp Gly Val Pro Val Pro Gln Gln Val Leu Ser 100 105 110Tyr Gly Glu Gly Cys Pro Leu Glu Arg Glu Gln Val Leu Pro Ala Ala 115 120125 Gly Asp Val Pro Glu Gly Gly Gln Arg Ala Val Val Glu Gly Ile His 130135 140 Arg Glu Thr Leu Glu Gly Leu Ala Pro Asp Pro Ser Ala Ser Tyr Ala145 150 155 160 Phe Glu Leu Leu Gly Gly Phe Val Arg Pro Ala Val Thr AlaAla Ala 165 170 175 Ala Ala Val Leu Gly Val Pro Ala Asp Arg Arg Ala AspPhe Ala Asp 180 185 190 Leu Leu Glu Arg Leu Arg Pro Leu Ser Asp Ser LeuLeu Ala Pro Gln 195 200 205 Ser Leu Arg Thr Val Arg Ala Ala Asp Gly AlaLeu Ala Glu Leu Thr 210 215 220 Ala Leu Leu Ala Asp Ser Asp Asp Ser ProGly Ala Leu Leu Ser Ala 225 230 235 240 Leu Gly Val Thr Ala Ala Val GlnLeu Thr Gly Asn Ala Val Leu Ala 245 250 255 Leu Leu Ala His Pro Glu GlnTrp Arg Glu Leu Cys Asp Arg Pro Gly 260 265 270 Leu Ala Ala Ala Ala ValGlu Glu Thr Leu Arg Tyr Asp Pro Pro Val 275 280 285 Gln Leu Asp Ala ArgVal Val Arg Gly Glu Thr Glu Leu Ala Gly Arg 290 295 300 Arg Leu Pro AlaGly Ala His Val Val Val Leu Thr Ala Ala Thr Gly 305 310 315 320 Arg AspPro Glu Val Phe Thr Asp Pro Glu Arg Phe Asp Leu Ala Arg 325 330 335 ProAsp Ala Ala Ala His Leu Ala Leu His Pro Ala Gly Pro Tyr Gly 340 345 350Pro Val Ala Ser Leu Val Arg Leu Gln Ala Glu Val Ala Leu Arg Thr 355 
360365 Leu Ala Gly Arg Phe Pro Gly Leu Arg Gln Ala Gly Asp Val Leu Arg 370375 380 Pro Arg Arg Ala Pro Val Gly Arg Gly Pro Leu Ser Val Pro Val Ser385 390 395 400 Ser Ser Met Arg Val Leu Leu Thr Ser Phe Ala His His ThrHis Tyr 405 410 415 Tyr Gly Leu Val Pro Leu Ala Trp Ala Leu Leu Ala AlaGly His Glu 420 425 430 Val Arg Val Ala Ser Gln Pro Ala Leu Thr Asp ThrIle Thr Gly Ser 435 440 445 Gly Leu Ala Ala Val Pro Val Gly Thr Asp HisLeu Ile His Glu Tyr 450 455 460 Arg Val Arg Met Ala Gly Glu Pro Arg ProAsn His Pro Ala Ile Ala 465 470 475 480 Phe Asp Glu Ala Arg Pro Glu ProLeu Asp Trp Asp His Ala Leu Gly 485 490 495 Ile Glu Ala Ile Leu Ala ProTyr Phe His Leu Leu Ala Asn Asn Asp 500 505 510 Ser Met Val Asp Asp LeuVal Asp Phe Ala Arg Ser Trp Gln Pro Asp 515 520 525 Leu Val Leu Trp GluPro Thr Thr Tyr Ala Gly Ala Val Ala Ala Gln 530 535 540 Val Thr Gly AlaAla His Ala Arg Val Leu Trp Gly Pro Asp Val Met 545 550 555 560 Gly SerAla Arg Arg Lys Phe Val Ala Leu Arg Asp Arg Gln Pro Pro 565 570 575 GluHis Arg Glu Asp Pro Thr Ala Glu Trp Leu Thr Trp Thr Leu Asp 580 585 590Arg Tyr Gly Ala Ser Phe Glu Glu Glu Leu Leu Thr Gly Gln Phe Thr 595 600605 Ile Asp Pro Thr Pro Pro Ser Leu Arg Leu Asp Thr Gly Leu Pro Thr 610615 620 Val Gly Met Arg Tyr Val Pro Tyr Asn Gly Thr Ser Val Val Pro Asp625 630 635 640 Trp Leu Ser Glu Pro Pro Ala Arg Pro Arg Val Cys Leu ThrLeu Gly 645 650 655 Val Ser Ala Arg Glu Val Leu Gly Gly Asp Gly Val SerGln Gly Asp 660 665 670 Ile Leu Glu Ala Leu Ala Asp Leu Asp Ile Glu LeuVal Ala Thr Leu 675 680 685 Asp Ala Ser Gln Arg Ala Glu Ile Arg Asn TyrPro Lys His Thr Arg 690 695 700 Phe Thr Asp Phe Val Pro Met His Ala LeuLeu Pro Ser Cys Ser Ala 705 710 715 720 Ile Ile His His Gly Gly Ala GlyThr Tyr Ala Thr Ala Val Ile Asn 725 730 735 Ala Val Pro Gln Val Met LeuAla Glu Leu Trp Asp Ala Pro Val Lys 740 745 750 Ala Arg Ala Val Ala GluGln Gly Ala Gly Phe Phe Leu Pro Pro Ala 755 760 765 Glu Leu Thr Pro GlnAla Val Arg Asp Ala Val Val Arg Ile Leu Asp 770 775 780 Asp Pro Ser ValAla Thr 
Ala Ala His Arg Leu Arg Glu Glu Thr Phe 785 790 795 800 Gly AspPro Thr Pro Ala Gly Ile Val Pro Glu Leu Glu Arg Leu Ala 805 810 815 AlaGln His Arg Arg Pro Pro Ala Asp Ala Arg His Met Tyr Glu Val 820 825 830Asp His Ala Asp Val Tyr Asp Leu Phe Tyr Leu Gly Arg Gly Lys Asp 835 840845 Tyr Ala Ala Glu Ala Ser Asp Ile Ala Asp Leu Val Arg Ser Arg Thr 850855 860 Pro Glu Ala Ser Ser Leu Leu Asp Val Ala Cys Gly Thr Gly Thr His865 870 875 880 Leu Glu His Phe Thr Lys Glu Phe Gly Asp Thr Ala Gly LeuGlu Leu 885 890 895 Ser Glu Asp Met Leu Thr His Ala Arg Lys Arg Leu ProAsp Ala Thr 900 905 910 Leu His Gln Gly Asp Met Arg Asp Phe Arg Leu GlyArg Lys Phe Ser 915 920 925 Ala Val Val Ser Met Phe Ser Ser Val Gly TyrLeu Lys Thr Thr Glu 930 935 940 Glu Leu Gly Ala Ala Val Ala Ser Phe AlaGlu His Leu Glu Pro Gly 945 950 955 960 Gly Val Val Val Val Glu Pro TrpTrp Phe Pro Glu Thr Phe Ala Asp 965 970 975 Gly Trp Val Ser Ala Asp ValVal Arg Arg Asp Gly Arg Thr Val Ala 980 985 990 Arg Val Ser His Ser ValArg Glu Gly Asn Ala Thr Arg Met Glu Val 995 1000 1005 His Phe Thr ValAla Asp Pro Gly Lys Gly Val Arg His Phe Ser Asp 1010 1015 1020 Val HisLeu Ile Thr Leu Phe His Gln Ala Glu Tyr Glu Ala Ala Phe 1025 1030 10351040 Thr Ala Ala Gly Leu Arg Val Glu Tyr Leu Glu Gly Gly Pro Ser Gly1045 1050 1055 Arg Gly Leu Phe Val Gly Val Pro Ala Met Thr Gly Lys ThrArg Ile 1060 1065 1070 Pro Arg Val Arg Arg Gly Arg Thr Thr Pro Arg AlaPhe Thr Leu Ala 1075 1080 1085 Val Val Gly Thr Leu Leu Ala Gly Thr ThrVal Ala Ala Ala Ala Pro 1090 1095 1100 Gly Ala Ala Asp Thr Ala Asn ValGln Tyr Thr Ser Arg Ala Ala Glu 1105 1110 1115 1120 Leu Val Ala Gln MetThr Leu Asp Glu Lys Ile Ser Phe Val His Trp 1125 1130 1135 Ala Leu AspPro Asp Arg Gln Asn Val Gly Tyr Leu Pro Gly Val Pro 1140 1145 1150 ArgLeu Gly Ile Pro Glu Leu Arg Ala Ala Asp Gly Pro Asn Gly Ile 1155 11601165 Arg Leu Val Gly Gln Thr Ala Thr Ala Leu Pro Ala Pro Val Ala Leu1170 1175 1180 Ala Ser Thr Phe Asp Asp Thr Met Ala Asp Ser Tyr Gly LysVal Met 1185 1190 1195 1200 Gly Arg 
Asp Gly Arg Ala Leu Asn Gln Asp MetVal Leu Gly Pro Met 1205 1210 1215 Met Asn Asn Ile Arg Val Pro His GlyGly Arg Asn Tyr Glu Thr Phe 1220 1225 1230 Ser Glu Asp Pro Leu Val SerSer Arg Thr Ala Val Ala Gln Ile Lys 1235 1240 1245 Gly Ile Gln Gly AlaGly Leu Met Thr Thr Ala Lys His Phe Ala Ala 1250 1255 1260 Asn Asn GlnGlu Asn Asn Arg Phe Ser Val Asn Ala Asn Val Asp Glu 1265 1270 1275 1280Gln Thr Leu Arg Glu Ile Glu Phe Pro Ala Phe Glu Ala Ser Ser Lys 12851290 1295 Ala Gly Ala Ala Ser Phe Met Cys Ala Tyr Asn Gly Leu Asn GlyLys 1300 1305 1310 Pro Ser Cys Gly Asn Asp Glu Leu Leu Asn Asn Val LeuArg Thr Gln 1315 1320 1325 Trp Gly Phe Gln Gly Trp Val Met Ser Asp TrpLeu Ala Thr Pro Gly 1330 1335 1340 Thr Asp Ala Ile Thr Lys Gly Leu AspGln Glu Met Gly Val Glu Leu 1345 1350 1355 1360 Pro Gly Asp Val Pro LysGly Glu Pro Ser Pro Pro Ala Lys Phe Phe 1365 1370 1375 Gly Glu Ala LeuLys Thr Ala Val Leu Asn Gly Thr Val Pro Glu Ala 1380 1385 1390 Ala ValThr Arg Ser Ala Glu Arg Ile Val Gly Gln Met Glu Lys Phe 1395 1400 1405Gly Leu Leu Leu Ala Thr Pro Ala Pro Arg Pro Glu Arg Asp Lys Ala 14101415 1420 Gly Ala Gln Ala Val Ser Arg Lys Val Ala Glu Asn Gly Ala ValLeu 1425 1430 1435 1440 Leu Arg Asn Glu Gly Gln Ala Leu Pro Leu Ala GlyAsp Ala Gly Lys 1445 1450 1455 Ser Ile Ala Val Ile Gly Pro Thr Ala ValAsp Pro Lys Val Thr Gly 1460 1465 1470 Leu Gly Ser Ala His Val Val ProAsp Ser Ala Ala Ala Pro Leu Asp 1475 1480 1485 Thr Ile Lys Ala Arg AlaGly Ala Gly Ala Thr Val Thr Tyr Glu Thr 1490 1495 1500 Gly Glu Glu ThrPhe Gly Thr Gln Ile Pro Ala Gly Asn Leu Ser Pro 1505 1510 1515 1520 AlaPhe Asn Gln Gly His Gln Leu Glu Pro Gly Lys Ala Gly Ala Leu 1525 15301535 Tyr Asp Gly Thr Leu Thr Val Pro Ala Asp Gly Glu Tyr Arg Ile Ala1540 1545 1550 Val Arg Ala Thr Gly Gly Tyr Ala Thr Val Gln Leu Gly SerHis Thr 1555 1560 1565 Ile Glu Ala Gly Gln Val Tyr Gly Lys Val Ser SerPro Leu Leu Lys 1570 1575 1580 Leu Thr Lys Gly Thr His Lys Leu Thr IleSer Gly Phe Ala Met Ser 1585 1590 1595 1600 Ala Thr Pro Leu Ser Leu GluLeu Gly 
Trp Val Thr Pro Ala Ala Ala 1605 1610 1615 Asp Ala Thr Ile AlaLys Ala Val Glu Ser Ala Arg Lys Ala Arg Thr 1620 1625 1630 Ala Val ValPhe Ala Tyr Asp Asp Gly Thr Glu Gly Val Asp Arg Pro 1635 1640 1645 AsnLeu Ser Leu Pro Gly Thr Gln Asp Lys Leu Ile Ser Ala Val Ala 1650 16551660 Asp Ala Asn Pro Asn Thr Ile Val Val Leu Asn Thr Gly Ser Ser Val1665 1670 1675 1680 Leu Met Pro Trp Leu Ser Lys Thr Arg Ala Val Leu AspMet Trp Tyr 1685 1690 1695 Pro Gly Gln Ala Gly Ala Glu Ala Thr Ala AlaLeu Leu Tyr Gly Asp 1700 1705 1710 Val Asn Pro Ser Gly Lys Leu Thr GlnSer Phe Pro Ala Ala Glu Asn 1715 1720 1725 Gln His Ala Val Ala Gly AspPro Thr Ser Tyr Pro Gly Val Asp Asn 1730 1735 1740 Gln Gln Thr Tyr ArgGlu Gly Ile His Val Gly Tyr Arg Trp Phe Asp 1745 1750 1755 1760 Lys GluAsn Val Lys Pro Leu Phe Pro Phe Gly His Gly Leu Ser Tyr 1765 1770 1775Thr Ser Phe Thr Gln Ser Ala Pro Thr Val Val Arg Thr Ser Thr Gly 17801785 1790 Gly Leu Lys Val Thr Val Thr Val Arg Asn Ser Gly Lys Arg AlaGly 1795 1800 1805 Gln Glu Val Val Gln Ala Tyr Leu Gly Ala Ser Pro AsnVal Thr Ala 1810 1815 1820 Pro Gln Ala Lys Lys Lys Leu Val Gly Tyr ThrLys Val Ser Leu Ala 1825 1830 1835 1840 Ala Gly Glu Ala Lys Thr Val ThrVal Asn Val Asp Arg Arg Gln Leu 1845 1850 1855 Gln Thr Gly Ser Ser SerAla Asp Leu Arg Gly Ser Ala Thr Val Asn 1860 1865 1870 Val Trp Met SerSer Arg Ala Glu Thr Pro Arg Val Pro Phe Leu Asp 1875 1880 1885 Leu LysAla Ala Tyr Glu Glu Leu Arg Ala Glu Thr Asp Ala Ala Ile 1890 1895 1900Ala Arg Val Leu Asp Ser Gly Arg Tyr Leu Leu Gly Pro Glu Leu Glu 19051910 1915 1920 Gly Phe Glu Ala Glu Phe Ala Ala Tyr Cys Glu Thr Asp HisAla Val 1925 1930 1935 Gly Val Asn Ser Gly Met Asp Ala Leu Gln Leu AlaLeu Arg Gly Leu 1940 1945 1950 Gly Ile Gly Pro Gly Asp Glu Val Ile ValPro Ser His Thr Tyr Ile 1955 1960 1965 Ala Ser Trp Leu Ala Val Ser AlaThr Gly Ala Thr Pro Val Pro Val 1970 1975 1980 Glu Pro His Glu Asp HisPro Thr Leu Asp Pro Leu Leu Val Glu Lys 1985 1990 1995 2000 Ala Ile ThrPro Arg Thr Arg Ala Leu Leu Pro Val His Leu Tyr Gly 
2005 2010 2015 HisPro Ala Asp Met Asp Ala Leu Arg Glu Leu Ala Asp Arg His Gly 2020 20252030 Leu His Ile Val Glu Asp Ala Ala Gln Ala His Gly Ala Arg Tyr Arg2035 2040 2045 Gly Arg Arg Ile Gly Ala Gly Ser Ser Val Ala Ala Phe SerPhe Tyr 2050 2055 2060 Pro Gly Lys Asn Leu Gly Cys Phe Gly Asp Gly GlyAla Val Val Thr 2065 2070 2075 2080 Gly Asp Pro Glu Leu Ala Glu Arg LeuArg Met Leu Arg Asn Tyr Gly 2085 2090 2095 Ser Arg Gln Lys Tyr Ser HisGlu Thr Lys Gly Thr Asn Ser Arg Leu 2100 2105 2110 Asp Glu Met Gln AlaAla Val Leu Arg Ile Arg Leu Ala His Leu Asp 2115 2120 2125 Ser Trp AsnGly Arg Arg Ser Ala Leu Ala Ala Glu Tyr Leu Ser Gly 2130 2135 2140 LeuAla Gly Leu Pro Gly Ile Gly Leu Pro Val Thr Ala Pro Asp Thr 2145 21502155 2160 Asp Pro Val Trp His Leu Phe Thr Val Arg Thr Glu Arg Arg AspGlu 2165 2170 2175 Leu Arg Ser His Leu Asp Ala Arg Gly Ile Asp Thr LeuThr His Tyr 2180 2185 2190 Pro Val Pro Val His Leu Ser Pro Ala Tyr AlaGly Glu Ala Pro Pro 2195 2200 2205 Glu Gly Ser Leu Pro Arg Ala Glu SerPhe Ala Arg Gln Val Leu Ser 2210 2215 2220 Leu Pro Ile Gly Pro His LeuGlu Arg Pro Gln Ala Leu Arg Val Ile 2225 2230 2235 2240 Asp Ala Val ArgGlu Trp Ala Glu Arg Val Asp Gln Ala Met Arg Leu 2245 2250 2255 Leu ValThr Gly Gly Ala Gly Phe Ile Gly Ser His Phe Val Arg Gln 2260 2265 2270Leu Leu Ala Gly Ala Tyr Pro Asp Val Pro Ala Asp Glu Val Ile Val 22752280 2285 Leu Asp Ser Leu Thr Tyr Ala Gly Asn Arg Ala Asn Leu Ala ProVal 2290 2295 2300 Asp Ala Asp Pro Arg Leu Arg Phe Val His Gly Asp IleArg Asp Ala 2305 2310 2315 2320 Gly Leu Leu Ala Arg Glu Leu Arg Gly ValAsp Ala Ile Val His Phe 2325 2330 2335 Ala Ala Glu Ser His Val Asp ArgSer Ile Ala Gly Ala Ser Val Phe 2340 2345 2350 Thr Glu Thr Asn Val GlnGly Thr Gln Thr Leu Leu Gln Cys Ala Val 2355 2360 2365 Asp Ala Gly ValGly Arg Val Val His Val Ser Thr Asp Glu Val Tyr 2370 2375 2380 Gly SerIle Asp Ser Gly Ser Trp Thr Glu Ser Ser Pro Leu Glu Pro 2385 2390 23952400 Asn Ser Pro Tyr Ala Ala Ser Lys Ala Gly Ser Asp Leu Val Ala Arg2405 2410 2415 Ala Tyr His 
Arg Thr Tyr Gly Leu Asp Val Arg Ile Thr ArgCys Cys 2420 2425 2430 Asn Asn Tyr Gly Pro Tyr Gln His Pro Glu Lys LeuIle Pro Leu Phe 2435 2440 2445 Val Thr Asn Leu Leu Asp Gly Gly Thr LeuPro Leu Tyr Gly Asp Gly 2450 2455 2460 Ala Asn Val Arg Glu Trp Val HisThr Asp Asp His Cys Arg Gly Ile 2465 2470 2475 2480 Ala Leu Val Leu AlaGly Gly Arg Ala Gly Glu Ile Tyr His Ile Gly 2485 2490 2495 Gly Gly LeuGlu Leu Thr Asn Arg Glu Leu Thr Gly Ile Leu Leu Asp 2500 2505 2510 SerLeu Gly Ala Asp Trp Ser Ser Val Arg Lys Val Ala Asp Arg Lys 2515 25202525 Gly His Asp Leu Arg Tyr Ser Leu Asp Gly Gly Glu Ile Glu Arg Glu2530 2535 2540 Leu Gly Tyr Arg Pro Gln Val Ser Phe Ala Asp Gly Leu AlaArg Thr 2545 2550 2555 2560 Val Arg Trp Tyr Arg Glu Asn Arg Gly Trp TrpGlu Pro Leu Lys Ala 2565 2570 2575 Thr Ala Pro Gln Leu Pro Ala Thr AlaVal Glu Val Ser Ala Met Lys 2580 2585 2590 Gly Ile Val Leu Ala Gly GlySer Gly Thr Arg Leu His Pro Ala Thr 2595 2600 2605 Ser Val Ile Ser LysGln Ile Leu Pro Val Tyr Asn Lys Pro Met Ile 2610 2615 2620 Tyr Tyr ProLeu Ser Val Leu Met Leu Gly Gly Ile Arg Glu Ile Gln 2625 2630 2635 2640Ile Ile Ser Thr Pro Gln His Ile Glu Leu Phe Gln Ser Leu Leu Gly 26452650 2655 Asn Gly Arg His Leu Gly Ile Glu Leu Asp Tyr Ala Val Gln LysGlu 2660 2665 2670 Pro Ala Gly Ile Ala Asp Ala Leu Leu Val Gly Ala GluHis Ile Gly 2675 2680 2685 Asp Asp Thr Cys Ala Leu Ile Leu Gly Asp AsnIle Phe His Gly Pro 2690 2695 2700 Gly Leu Tyr Thr Leu Leu Arg Asp SerIle Ala Arg Leu Asp Gly Cys 2705 2710 2715 2720 Val Leu Phe Gly Tyr ProVal Lys Asp Pro Glu Arg Tyr Gly Val Ala 2725 2730 2735 Glu Val Asp AlaThr Gly Arg Leu Thr Asp Leu Val Glu Lys Pro Val 2740 2745 2750 Lys ProArg Ser Asn Leu Ala Val Thr Gly Leu Tyr Leu Tyr Asp Asn 2755 2760 2765Asp Val Val Asp Ile Ala Lys Asn Ile Arg Pro Ser Pro Arg Gly Glu 27702775 2780 Leu Glu Ile Thr Asp Val Asn Arg Val Tyr Leu Glu Arg Gly ArgAla 2785 2790 2795 2800 Glu Leu Val Asn Leu Gly Arg Gly Phe Ala Trp LeuAsp Thr Gly Thr 2805 2810 2815 His Asp Ser Leu Leu Arg Ala Ala Gln 
TyrVal Gln Val Leu Glu Glu 2820 2825 2830 Arg Gln Gly Val Trp Ile Ala GlyLeu Glu Glu Ile Ala Phe Arg Met 2835 2840 2845 Gly Phe Ile Asp Ala GluAla Cys His Gly Leu Gly Glu Gly Leu Ser 2850 2855 2860 Arg Thr Glu TyrGly Ser Tyr Leu Met Glu Ile Ala Gly Arg Glu Gly 2865 2870 2875 2880 AlaPro Met Thr Ala Pro Ala Leu Ser Ala Thr Ala Pro Ala Glu Arg 2885 28902895 Cys Ala His Pro Gly Ala Asp Leu Gly Ala Ala Val His Ala Val Gly2900 2905 2910 Gln Thr Leu Ala Ala Gly Gly Leu Val Pro Pro Asp Glu AlaGly Thr 2915 2920 2925 Thr Ala Arg His Leu Val Arg Leu Ala Val Arg TyrGly Asn Ser Pro 2930 2935 2940 Phe Thr Pro Leu Glu Glu Ala Arg His AspLeu Gly Val Asp Arg Asp 2945 2950 2955 2960 Ala Phe Arg Arg Leu Leu AlaLeu Phe Gly Gln Val Pro Glu Leu Arg 2965 2970 2975 Thr Ala Val Glu ThrGly Pro Ala Gly Ala Tyr Trp Lys Asn Thr Leu 2980 2985 2990 Leu Pro LeuGlu Gln Arg Gly Val Phe Asp Ala Ala Leu Ala Arg Lys 2995 3000 3005 ProVal Phe Pro Tyr Ser Val Gly Leu Tyr Pro Gly Pro Thr Cys Met 3010 30153020 Phe Arg Cys His Phe Cys Val Arg Val Thr Gly Ala Arg Tyr Asp Pro3025 3030 3035 3040 Ser Ala Leu Asp Ala Gly Asn Ala Met Phe Arg Ser ValIle Asp Glu 3045 3050 3055 Ile Pro Ala Gly Asn Pro Ser Ala Met Tyr PheSer Gly Gly Leu Glu 3060 3065 3070 Pro Leu Thr Asn Pro Gly Leu Gly SerLeu Ala Ala His Ala Thr Asp 3075 3080 3085 His Gly Leu Arg Pro Thr ValTyr Thr Asn Ser Phe Ala Leu Thr Glu 3090 3095 3100 Arg Thr Leu Glu ArgGln Pro Gly Leu Trp Gly Leu His Ala Ile Arg 3105 3110 3115 3120 Thr SerLeu Tyr Gly Leu Asn Asp Glu Glu Tyr Glu Gln Thr Thr Gly 3125 3130 3135Lys Lys Ala Ala Phe Arg Arg Val Arg Glu Asn Leu Arg Arg Phe Gln 31403145 3150 Gln Leu Arg Ala Glu Arg Glu Ser Pro Ile Asn Leu Gly Phe AlaTyr 3155 3160 3165 Ile Val Leu Pro Gly Arg Ala Ser Arg Leu Leu Asp LeuVal Asp Phe 3170 3175 3180 Ile Ala Asp Leu Asn Asp Ala Gly Gln Gly ArgThr Ile Asp Phe Val 3185 3190 3195 3200 Asn Ile Arg Glu Asp Tyr Ser GlyArg Asp Asp Gly Lys Leu Pro Gln 3205 3210 3215 Glu Glu Arg Ala Glu LeuGln Glu Ala Leu Asn Ala Phe Glu Glu Arg 
3220 3225 3230 Val Arg Glu ArgThr Pro Gly Leu His Ile Asp Tyr Gly Tyr Ala Leu 3235 3240 3245 Asn SerLeu Arg Thr Gly Ala Asp Ala Glu Leu Leu Arg Ile Lys Pro 3250 3255 3260Ala Thr Met Arg Pro Thr Ala His Pro Gln Val Ala Val Gln Val Asp 32653270 3275 3280 Leu Leu Gly Asp Val Tyr Leu Tyr Arg Glu Ala Gly Phe ProAsp Leu 3285 3290 3295 Asp Gly Ala Thr Arg Tyr Ile Ala Gly Arg Val ThrPro Asp Thr Ser 3300 3305 3310 Leu Thr Glu Val Val Arg Asp Phe Val GluArg Gly Gly Glu Val Ala 3315 3320 3325 Ala Val Asp Gly Asp Glu Tyr PheMet Asp Gly Phe Asp Gln Val Val 3330 3335 3340 Thr Ala Arg Leu Asn GlnLeu Glu Arg Asp Ala Ala Asp Gly Trp Glu 3345 3350 3355 3360 Glu Ala ArgGly Phe Leu Arg Met Lys Ser Ala Leu Ser Asp Leu Ala 3365 3370 3375 PhePhe Gly Gly Pro Ala Ala Phe Asp Gln Pro Leu Leu Val Gly Arg 3380 33853390 Pro Asn Arg Ile Asp Arg Ala Arg Leu Tyr Glu Arg Leu Asp Arg Ala3395 3400 3405 Leu Asp Ser Gln Trp Leu Ser Asn Gly Gly Pro Leu Val ArgGlu Phe 3410 3415 3420 Glu Glu Arg Val Ala Gly Leu Ala Gly Val Arg HisAla Val Ala Thr 3425 3430 3435 3440 Cys Asn Ala Thr Ala Gly Leu Gln LeuLeu Ala His Ala Ala Gly Leu 3445 3450 3455 Thr Gly Glu Val Ile Met ProSer Met Thr Phe Ala Ala Thr Pro His 3460 3465 3470 Ala Leu Arg Trp IleGly Leu Thr Pro Val Phe Ala Asp Ile Asp Pro 3475 3480 3485 Asp Thr GlyAsn Leu Asp Pro Asp Gln Val Ala Ala Ala Val Thr Pro 3490 3495 3500 ArgThr Ser Ala Val Val Gly Val His Leu Trp Gly Arg Pro Cys Ala 3505 35103515 3520 Ala Asp Gln Leu Arg Lys Val Ala Asp Glu His Gly Leu Arg LeuTyr 3525 3530 3535 Phe Asp Ala Ala His Ala Leu Gly Cys Ala Val Asp GlyArg Pro Ala 3540 3545 3550 Gly Ser Leu Gly Asp Ala Glu Val Phe Ser PheHis Ala Thr Lys Ala 3555 3560 3565 Val Asn Ala Phe Glu Gly Gly Ala ValVal Thr Asp Asp Ala Asp Leu 3570 3575 3580 Ala Ala Arg Ile Arg Ala LeuHis Asn Phe Gly Phe Asp Leu Pro Gly 3585 3590 3595 3600 Gly Ser Pro AlaGly Gly Thr Asn Ala Lys Met Ser Glu Ala Ala Ala 3605 3610 3615 Ala MetGly Leu Thr Ser Leu Asp Ala Phe Pro Glu Val Ile Asp Arg 3620 3625 3630Asn Arg Arg 
Asn His Ala Ala Tyr Arg Glu His Leu Ala Asp Leu Pro 36353640 3645 Gly Val Leu Val Ala Asp His Asp Arg His Gly Leu Asn Asn HisGln 3650 3655 3660 Tyr Val Ile Val Glu Ile Asp Glu Ala Thr Thr Gly IleHis Arg Asp 3665 3670 3675 3680 Leu Val Met Glu Val Leu Lys Ala Glu GlyVal His Thr Arg Ala Tyr 3685 3690 3695 Phe Ser Pro Gly Cys His Glu LeuGlu Pro Tyr Arg Gly Gln Pro His 3700 3705 3710 Ala Pro Leu Pro His ThrGlu Arg Leu Ala Ala Arg Val Leu Ser Leu 3715 3720 3725 Pro Thr Gly ThrAla Ile Gly Asp Asp Asp Ile Arg Arg Val Ala Asp 3730 3735 3740 Leu LeuArg Leu Cys Ala Thr Arg Gly Arg Glu Leu Thr Ala Arg His 3745 3750 37553760 Arg Asp Thr Ala Pro Ala Pro Leu Ala Ala Pro Gln Thr Ser Thr Pro3765 3770 3775 Thr Ile Gly Arg Ser Arg 3780 5 36778 DNA Streptomycesvenezuelae 5 ggatccgacc gtgggtgtga atctccgggt gctcgcctcg tcctgccccgttacctgtcc 60 gcctcccgct ccagaccagc gggaggcgga caggggcatg cccgccgggcggctaacggc 120 ccgtgcggcg tccgtacgac gagcctcgcg cgccctggcg gcccttggtctgccggacct 180 gtgcgcgggg tgcgcagggt tcgccgccgc gcgtggggcc gtatctgcggctcccgggca 240 cggcggccct gctcgtctcc gagtcatagt ccctgccgcc ggcgccaccgccctggcccg 300 gcatgcgcgt gccgggcgcc cccggcgcgt aactcggctg ggaggcctggaaaagggcga 360 tccattgggt gagcgtgagg tccttcggca gtccgccgtc cggaattccgtggcggtcgg 420 cgagggaacg gtaggtccgc ttggggatgt ggcgccggag gatctccgcgaggccccgtc 480 cggggccggt gaagacggct tcggcgaagt tctggaaggc gcggctcgcgctctcgggca 540 gcaggggctg ggggcgtcgc ctgatcgtca ggacgccgcc gtcgacgcggggcatcggac 600 ggaacgacga ggcgcggacg cggtcgtgga ccgcgaactc gtaccagggggcccaggagg 660 tcgtgaggag cgatccgccg ctgcgaccgg cgcgtttgcg ggcgacctcccactgcacta 720 tcagggccgc cgactgccag ttcgtcgatt ccaggagact ccggagaatctgggtcgtga 780 tgccgaaggg aacgtttccg acgacggtgt cgatatcgcg cggaatgcggaagtcgagga 840 aatcaccctg gaatacggtg accctctccc cttcgaattt ccgccgcacatgcgcggccc 900 agtgcgggtc catctccacg accgtcacgg tgtcgaagga gcgcaccaactcctcggtta 960 tcgcgccctt tccggggccg atttcgagaa cgttcctacc gtccccctcgacatgcgtga 1020 cgagattgcg cacggctctg tcgtcctgaa ggaagttctg 
gcctaattcgcggcgaaggg 1080 tgtcgcggtc cgctcgcctc ggtatggagt cgcgcattgc catgaacgatcccctccctg 1140 gatgccgtgg tcaatggact tggcacggac catacctcac ggtccgtcggacgaccggag 1200 aagaagttca cgcacgggcg ttccggagta cgggagttgt gaacggccgcgacgaagtcg 1260 gtcgcggctc ggcgggcggt gacgagcgag gtccggagga acgcgacgaagcagccgaac 1320 cccaagtgag gtgcgacgga gtgacattgg gggcatacgg agggttgtcgtacggagcgc 1380 actcaacgag gctccaggag ggaggggttg aacccgccgc cgactggccttcgccgcccg 1440 cgcggccgga gtatgtcatg tcgggggtga aatcaagcca ttcccccgggatcggctgtt 1500 acccatccct ttacctggcg tggatttccc aacccttggt atagagcgggagacgacgcg 1560 acaccatgga gaccacgcac accacgagcg ccaccccccg gccatcccgacaaggggggt 1620 ccggctcgcc tcccgacacc catggcctgg ggtacacgcc aggtatagggggaacgtagg 1680 gggagcatag ggggggtgcc ctggggttgg gtgaaagcgc ggcttccggagacggagccg 1740 gatgtcttca gccggaatta ccaggaccgg tgcgagaaca ccggtgacagggcgtggggc 1800 ggcagcgtgg gacacggggg aagtgcgggt ccgacggggg ttgccccctgccggccccga 1860 tcatgcggag cactccttct ctcgtgctcc taccggtgat gtgcgcgccgaattgattcg 1920 tggagagatg tcgacagtgt ccaagagtga gtccgaggaa ttcgtgtccgtgtcgaacga 1980 cgccggttcc gcgcacggca cagcggaacc cgtcgccgtc gtcggcatctcctgccgggt 2040 gcccggcgcc cgggacccga gagagttctg ggaactcctg gcggcaggcggccaggccgt 2100 caccgacgtc cccgcggacc gctggaacgc cggcgacttc tacgacccggaccgctccgc 2160 ccccggccgc tcgaacagcc ggtggggcgg gttcatcgag gacgtcgaccggttcgacgc 2220 cgccttcttc ggcatctcgc cccgcgaggc cgcggagatg gacccgcagcagcggctcgc 2280 cctggagctg ggctgggagg ccctggagcg cgccgggatc gacccgtcctcgctcaccgg 2340 cacccgcacc ggcgtcttcg ccggcgccat ctgggacgac tacgccaccctgaagcaccg 2400 ccagggcggc gccgcgatca ccccgcacac cgtcaccggc ctccaccgcggcatcatcgc 2460 gaaccgactc tcgtacacgc tcgggctccg cggccccagc atggtcgtcgactccggcca 2520 gtcctcgtcg ctcgtcgccg tccacctcgc gtgcgagagc ctgcggcgcggcgagtccga 2580 gctcgccctc gccggcggcg tctcgctcaa cctggtgccg gacagcatcatcggggcgag 2640 caagttcggc ggcctctccc ccgacggccg cgcctacacc ttcgacgcgcgcgccaacgg 2700 ctacgtacgc ggcgagggcg gcggtttcgt cgtcctgaag cgcctctcccgggccgtcgc 2760 cgacggcgac 
ccggtgctcg ccgtgatccg gggcagcgcc gtcaacaacggcggcgccgc 2820 ccagggcatg acgacccccg acgcgcaggc gcaggaggcc gtgctccgcgaggcccacga 2880 gcgggccggg accgcgccgg ccgacgtgcg gtacgtcgag ctgcacggcaccggcacccc 2940 cgtgggcgac ccgatcgagg ccgctgcgct cggcgccgcc ctcggcaccggccgcccggc 3000 cggacagccg ctcctggtcg gctcggtcaa gacgaacatc ggccacctggagggcgcggc 3060 cggcatcgcc ggcctcatca aggccgtcct ggcggtccgc ggtcgcgcgctgcccgccag 3120 cctgaactac gagaccccga acccggcgat cccgttcgag gaactgaacctccgggtgaa 3180 cacggagtac ctgccgtggg agccggagca cgacgggcag cggatggtcgtcggcgtgtc 3240 ctcgttcggc atgggcggca cgaacgcgca tgtcgtgctc gaagaggcccccgggggttg 3300 tcgaggtgct tcggtcgtgg agtcgacggt cggcgggtcg gcggtcggcggcggtgtggt 3360 gccgtgggtg gtgtcggcga agtccgctgc cgcgctggac gcgcagatcgagcggcttgc 3420 cgcgttcgcc tcgcgggatc gtacggatgg tgtcgacgcg ggcgctgtcgatgcgggtgc 3480 tgtcgatgcg ggtgctgtcg ctcgcgtact ggccggcggg cgtgctcagttcgagcaccg 3540 ggccgtcgtc gtcggcagcg ggccggacga tctggcggca gcgctggccgcgcctgaggg 3600 tctggtccgg ggcgtggctt ccggtgtcgg gcgagtggcg ttcgtgttccccgggcaggg 3660 cacgcagtgg gccggcatgg gtgccgaact gctggactct tccgcggtgttcgcggcggc 3720 catggccgaa tgcgaggccg cactctcccc gtacgtcgac tggtcgctggaggccgtcgt 3780 acggcaggcc cccggtgcgc ccacgctgga gcgggtcgat gtcgtgcagcctgtgacgtt 3840 cgccgtcatg gtctcgctgg ctcgcgtgtg gcagcaccac ggggtgacgccccaggcggt 3900 cgtcggccac tcgcagggcg agatcgccgc cgcgtacgtc gccggtgccctgagcctgga 3960 cgacgccgct cgtgtcgtga ccctgcgcag caagtccatc gccgcccacctcgccggcaa 4020 gggcggcatg ctgtccctcg cgctgagcga ggacgccgtc ctggagcgactggccgggtt 4080 cgacgggctg tccgtcgccg ctgtgaacgg gcccaccgcc accgtggtctccggtgaccc 4140 cgtacagatc gaagagcttg ctcgggcgtg tgaggccgat ggggtccgtgcgcgggtcat 4200 tcccgtcgac tacgcgtccc acagccggca ggtcgagatc atcgagagcgagctcgccga 4260 ggtcctcgcc gggctcagcc cgcaggctcc gcgcgtgccg ttcttctcgacactcgaagg 4320 cgcctggatc accgagcccg tgctcgacgg cggctactgg taccgcaacctgcgccatcg 4380 tgtgggcttc gccccggccg tcgagaccct ggccaccgac gagggcttcacccacttcgt 4440 cgaggtcagc gcccaccccg tcctcaccat ggccctcccc 
gggaccgtcaccggtctggc 4500 gaccctgcgt cgcgacaacg gcggtcagga ccgcctagtc gcctccctcgccgaagcatg 4560 ggccaacgga ctcgcggtcg actggagccc gctcctcccc tccgcgaccggccaccactc 4620 cgacctcccc acctacgcgt tccagaccga gcgccactgg ctgggcgagatcgaggcgct 4680 cgccccggcg ggcgagccgg cggtgcagcc cgccgtcctc cgcacggaggcggccgagcc 4740 ggcggagctc gaccgggacg agcagctgcg cgtgatcctg gacaaggtccgggcgcagac 4800 ggcccaggtg ctggggtacg cgacaggcgg gcagatcgag gtcgaccggaccttccgtga 4860 ggccggttgc acctccctga ccggcgtgga cctgcgcaac cggatcaacgccgccttcgg 4920 cgtacggatg gcgccgtcca tgatcttcga cttccccacc cccgaggctctcgcggagca 4980 gctgctcctc gtcgtgcacg gggaggcggc ggcgaacccg gccggtgcggagccggctcc 5040 ggtggcggcg gccggtgccg tcgacgagcc ggtggcgatc gtcggcatggcctgccgcct 5100 gcccggtggg gtcgcctcgc cggaggacct gtggcggctg gtggccggcggcggggacgc 5160 gatctcggag ttcccgcagg accgcggctg ggacgtggag gggctgtaccacccggatcc 5220 ggagcacccc ggcacgtcgt acgtccgcca gggcggtttc atcgagaacgtcgccggctt 5280 cgacgcggcc ttcttcggga tctcgccgcg cgaggccctc gccatggacccgcagcagcg 5340 gctcctcctc gaaacctcct gggaggccgt cgaggacgcc gggatcgacccgacctccct 5400 gcggggacgg caggtcggcg tcttcactgg ggcgatgacc cacgagtacgggccgagcct 5460 gcgggacggc ggggaaggcc tcgacggcta cctgctgacc ggcaacacggccagcgtgat 5520 gtcgggccgc gtctcgtaca cactcggcct tgagggcccc gccctgacggtggacacggc 5580 ctgctcgtcg tcgctggtcg ccctgcacct cgccgtgcag gccctgcgcaagggcgaggt 5640 cgacatggcg ctcgccggcg gcgtggccgt gatgcccacg cccgggatgttcgtcgagtt 5700 cagccggcag cgcgggctgg ccggggacgg ccggtcgaag gcgttcgccgcgtcggcgga 5760 cggcaccagc tggtccgagg gcgtcggcgt cctcctcgtc gagcgcctgtcggacgcccg 5820 ccgcaacgga caccaggtcc tcgcggtcgt ccgcggcagc gccttgaaccaggacggcgc 5880 gagcaacggc ctcacggctc cgaacgggcc ctcgcagcag cgcgtcatccggcgcgcgct 5940 ggcggacgcc cggctgacga cctccgacgt ggacgtcgtc gaggcacacggcacgggcac 6000 gcgactcggc gacccgatcg aggcgcaggc cctgatcgcc acctacggccagggccgtga 6060 cgacgaacag ccgctgcgcc tcgggtcgtt gaagtccaac atcgggcacacccaggccgc 6120 ggccggcgtc tccggtgtca tcaagatggt ccaggcgatg cgccacggactgctgccgaa 6180 gacgctgcac 
gtcgacgagc cctcggacca gatcgactgg tcggctggcgccgtggaact 6240 cctcaccgag gccgtcgact ggccggagaa gcaggacggc gggctgcgccgggccgccgt 6300 ctcctccttc gggatcagcg gcaccaatgc gcatgtggtg ctcgaagaggccccggtggt 6360 tgtcgagggt gcttcggtcg tcgagccgtc ggttggcggg tcggcggtcggcggcggtgt 6420 gacgccttgg gtggtgtcgg cgaagtccgc tgccgcgctc gacgcgcagatcgagcggct 6480 tgccgcattc gcctcgcggg atcgtacgga tgacgccgac gccggtgctgtcgacgcggg 6540 cgctgtcgct cacgtactgg ctgacgggcg tgctcagttc gagcaccgggccgtcgcgct 6600 cggcgccggg gcggacgacc tcgtacaggc gctggccgat ccggacgggctgatacgcgg 6660 aacggcttcc ggtgtcgggc gagtggcgtt cgtgttcccc ggtcagggcacgcagtgggc 6720 tggcatgggt gccgaactgc tggactcttc cgcggtgttc gcggcggccatggccgagtg 6780 tgaggccgcg ctgtccccgt acgtcgactg gtcgctggag gccgtcgtacggcaggcccc 6840 cggtgcgccc acgctggagc gggtcgatgt cgtgcagcct gtgacgttcgccgtcatggt 6900 ctcgctggct cgcgtgtggc agcaccacgg tgtgacgccc caggcggtcgtcggccactc 6960 gcagggcgag atcgccgccg cgtacgtcgc cggagccctg cccctggacgacgccgcccg 7020 cgtcgtcacc ctgcgcagca agtccatcgc cgcccacctc gccggcaagggcggcatgct 7080 gtccctcgcg ctgaacgagg acgccgtcct ggagcgactg agtgacttcgacgggctgtc 7140 cgtcgccgcc gtcaacgggc ccaccgccac tgtcgtgtcg ggtgaccccgtacagatcga 7200 agagcttgct caggcgtgca aggcggacgg attccgcgcg cggatcattcccgtcgacta 7260 cgcgtcccac agccggcagg tcgagatcat cgagagcgag ctcgcccaggtcctcgccgg 7320 tctcagcccg caggccccgc gcgtgccgtt cttctcgacg ctcgaaggcacctggatcac 7380 cgagcccgtc ctcgacggca cctactggta ccgcaacctc cgtcaccgcgtcggcttcgc 7440 ccccgccatc gagaccctgg ccgtcgacga gggcttcacg cacttcgtcgaggtcagcgc 7500 ccaccccgtc ctcaccatga ccctccccga gaccgtcacc ggcctcggcaccctccgtcg 7560 cgaacaggga ggccaagagc gtctggtcac ctcgctcgcc gaggcgtgggtcaacgggct 7620 tcccgtggca tggacttcgc tcctgcccgc cacggcctcc cgccccggtctgcccaccta 7680 cgccttccag gccgagcgct actggctcga gaacactccc gccgccctggccaccggcga 7740 cgactggcgc taccgcatcg actggaagcg cctcccggcc gccgaggggtccgagcgcac 7800 cggcctgtcc ggccgctggc tcgccgtcac gccggaggac cactccgcgcaggccgccgc 7860 cgtgctcacc gcgctggtcg acgccggggc gaaggtcgag 
gtgctgacggccggggcgga 7920 cgacgaccgt gaggccctcg ccgcccggct caccgcactg acgaccggtgacggcttcac 7980 cggcgtggtc tcgctcctcg acggactcgt accgcaggtc gcctgggtccaggcgctcgg 8040 cgacgccgga atcaaggcgc ccctgtggtc cgtcacccag ggcgcggtctccgtcggacg 8100 tctcgacacc cccgccgacc ccgaccgggc catgctctgg ggcctcggccgcgtcgtcgc 8160 ccttgagcac cccgaacgct gggccggcct cgtcgacctc cccgcccagcccgatgccgc 8220 cgccctcgcc cacctcgtca ccgcactctc cggcgccacc ggcgaggaccagatcgccat 8280 ccgcaccacc ggactccacg cccgccgcct cgcccgcgca cccctccacggacgtcggcc 8340 cacccgcgac tggcagcccc acggcaccgt cctcatcacc ggcggcaccggagccctcgg 8400 cagccacgcc gcacgctgga tggcccacca cggagccgaa cacctcctcctcgtcagccg 8460 cagcggcgaa caagcccccg gagccaccca actcaccgcc gaactcaccgcatcgggcgc 8520 ccgcgtcacc atcgccgcct gcgacgtcgc cgacccccac gccatgcgcaccctcctcga 8580 cgccatcccc gccgagacgc ccctcaccgc cgtcgtccac accgccggcgcgctcgacga 8640 cggcatcgtg gacacgctga ccgccgagca ggtccggcgg gcccaccgtgcgaaggccgt 8700 cggcgcctcg gtgctcgacg agctgacccg ggacctcgac ctcgacgcgttcgtgctctt 8760 ctcgtccgtg tcgagcactc tgggcatccc cggtcagggc aactacgccccgcacaacgc 8820 ctacctcgac gccctcgcgg ctcgccgccg ggccaccggc cggtccgccgtctcggtggc 8880 ctggggaccg tgggacggtg gcggcatggc cgccggtgac ggcgtggccgagcggctgcg 8940 caaccacggc gtgcccggca tggacccgga actcgccctg gccgcactggagtccgcgct 9000 cggccgggac gagaccgcga tcaccgtcgc ggacatcgac tgggaccgcttctacctcgc 9060 gtactcctcc ggtcgcccgc agcccctcgt cgaggagctg cccgaggtgcggcgcatcat 9120 cgacgcacgg gacagcgcca cgtccggaca gggcgggagc tccgcccagggcgccaaccc 9180 cctggccgag cggctggccg ccgcggctcc cggcgagcgt acggagatcctcctcggtct 9240 cgtacgggcg caggccgccg ccgtgctccg gatgcgttcg ccggaggacgtcgccgccga 9300 ccgcgccttc aaggacatcg gcttcgactc gctcgccggt gtcgagctgcgcaacaggct 9360 gacccgggcg accgggctcc agctgcccgc gacgctcgtc ttcgaccacccgacgccgct 9420 ggccctcgtg tcgctgctcc gcagcgagtt cctcggtgac gaggagacggcggacgcccg 9480 gcggtccgcg gcgctgcccg cgactgtcgg tgccggtgcc ggcgccggcgccggcaccga 9540 tgccgacgac gatccgatcg cgatcgtcgc gatgagctgc cgctaccccggtgacatccg 9600 cagcccggag 
gacctgtggc ggatgctgtc cgagggcggc gagggcatcacgccgttccc 9660 caccgaccgc ggctgggacc tcgacggcct gtacgacgcc gacccggacgcgctcggcag 9720 ggcgtacgtc cgcgagggcg ggttcctgca cgacgcggcc gagttcgacgcggagttctt 9780 cggcgtctcg ccgcgcgagg cgctggccat ggacccgcag cagcggatgctcctgacgac 9840 gtcctgggag gccttcgagc gggccggcat cgagccggca tcgctgcgcggcagcagcac 9900 cggtgtcttc atcggcctct cctaccagga ctacgcggcc cgcgtcccgaacgccccgcg 9960 tggcgtggag ggttacctgc tgaccggcag cacgccgagc gtcgcgtcgggccgtatcgc 10020 gtacaccttc ggtctcgaag ggcccgcgac gaccgtcgac accgcctgctcgtcgtcgct 10080 gaccgccctg cacctggcgg tgcgggcgct gcgcagcggc gagtgcacgatggcgctcgc 10140 cggtggcgtg gcgatgatgg cgaccccgca catgttcgtg gagttcagccgtcagcgggc 10200 gctcgccccg gacggccgca gcaaggcctt ctcggcggac gccgacgggttcggcgccgc 10260 ggagggcgtc ggcctgctgc tcgtggagcg gctctcggac gcgcggcgcaacggtcaccc 10320 ggtgctcgcc gtggtccgcg gtaccgccgt caaccaggac ggcgccagcaacgggctgac 10380 cgcgcccaac ggaccctcgc agcagcgggt gatccggcag gcgctcgccgacgcccggct 10440 ggcacccggc gacatcgacg ccgtcgagac gcacggcacg ggaacctcgctgggcgaccc 10500 catcgaggcc cagggcctcc aggccacgta cggcaaggag cggcccgcggaacggccgct 10560 cgccatcggc tccgtgaagt ccaacatcgg acacacccag gccgcggccggtgcggcggg 10620 catcatcaag atggtcctcg cgatgcgcca cggcaccctg ccgaagaccctccacgccga 10680 cgagccgagc ccgcacgtcg actgggcgaa cagcggcctg gccctcgtcaccgagccgat 10740 cgactggccg gccggcaccg gtccgcgccg cgccgccgtc tcctccttcggcatcagcgg 10800 gacgaacgcg cacgtcgtgc tggagcaggc gccggatgct gctggtgaggtgcttggggc 10860 cgatgaggtg cctgaggtgt ctgagacggt agcgatggct gggacggctgggacctccga 10920 ggtcgctgag ggctctgagg cctccgaggc ccccgcggcc cccggcagccgtgaggcgtc 10980 cctccccggg cacctgccct gggtgctgtc cgccaaggac gagcagtcgctgcgcggcca 11040 ggccgccgcc ctgcacgcgt ggctgtccga gcccgccgcc gacctgtcggacgcggacgg 11100 accggcccgc ctgcgggacg tcgggtacac gctcgccacg agccgtaccgccttcgcgca 11160 ccgcgccgcc gtgaccgccg ccgaccggga cgggttcctg gacgggctggccacgctggc 11220 ccagggcggc acctcggccc acgtccacct ggacaccgcc cgggacggcaccaccgcgtt 11280 cctcttcacc ggccagggca 
gtcagcgccc cggcgccggc cgtgagctgtacgaccggca 11340 ccccgtcttc gcccgggcgc tcgacgagat ctgcgcccac ctcgacggtcacctcgaact 11400 gcccctgctc gacgtgatgt tcgcggccga gggcagcgcg gaggccgcgctgctcgacga 11460 gacgcggtac acgcagtgcg cgctgttcgc cctggaggtc gcgctcttccggctcgtcga 11520 gagctggggc atgcggccgg ccgcactgct cggtcactcg gtcggcgagatcgccgccgc 11580 gcacgtcgcc ggtgtgttct cgctcgccga cgccgcccgc ctggtcgccgcgcgcggccg 11640 gctcatgcag gagctgcccg ccggtggcgc gatgctcgcc gtccaggccgcggaggacga 11700 gatccgcgtg tggctggaga cggaggagcg gtacgcggga cgtctggacgtcgccgccgt 11760 caacggcccc gaggccgccg tcctgtccgg cgacgcggac gcggcgcgggaggcggaggc 11820 gtactggtcc gggctcggcc gcaggacccg cgcgctgcgg gtcagccacgccttccactc 11880 cgcgcacatg gacggcatgc tcgacgggtt ccgcgccgtc ctggagacggtggagttccg 11940 gcgcccctcc ctgaccgtgg tctcgaacgt caccggcctg gccgccggcccggacgacct 12000 gtgcgacccc gagtactggg tccggcacgt ccgcggcacc gtccgcttcctcgacggcgt 12060 ccgtgtcctg cgcgacctcg gcgtgcggac ctgcctggag ctgggccccgacggggtcct 12120 caccgccatg gcggccgacg gcctcgcgga cacccccgcg gattccgctgccggctcccc 12180 cgtcggctct cccgccggct ctcccgccga ctccgccgcc ggcgcgctccggccccggcc 12240 gctgctcgtg gcgctgctgc gccgcaagcg gtcggagacc gagaccgtcgcggacgccct 12300 cggcagggcg cacgcccacg gcaccggacc cgactggcac gcctggttcgccggctccgg 12360 ggcgcaccgc gtggacctgc ccacgtactc cttccggcgc gaccgctactggctggacgc 12420 cccggcggcc gacaccgcgg tggacaccgc cggcctcggt ctcggcaccgccgaccaccc 12480 gctgctcggc gccgtggtca gccttccgga ccgggacggc ctgctgctcaccggccgcct 12540 ctccctgcgc acccacccgt ggctcgcgga ccacgccgtc ctggggagcgtcctgctccc 12600 cggcgccgcg atggtcgaac tcgccgcgca cgctgcggag tccgccggtctgcgtgacgt 12660 gcgggagctg accctccttg aaccgctggt actgcccgag cacggtggcgtcgagctgcg 12720 cgtgacggtc ggggcgccgg ccggagagcc cggtggcgag tcggccggggacggcgcacg 12780 gcccgtctcc ctccactcgc ggctcgccga cgcgcccgcc ggtaccgcctggtcctgcca 12840 cgcgaccggt ctgctggcca ccgaccggcc cgagcttccc gtcgcgcccgaccgtgcggc 12900 catgtggccg ccgcagggcg ccgaggaggt gccgctcgac ggtctctacgagcggctcga 12960 cgggaacggc ctcgccttcg gtccgctgtt 
ccaggggctg aacgcggtgtggcggtacga 13020 gggtgaggtc ttcgccgaca tcgcgctccc cgccaccacg aatgcgaccgcgcccgcgac 13080 cgcgaacggc ggcgggagtg cggcggcggc cccctacggc atccaccccgccctgctcga 13140 cgcttcgctg cacgccatcg cggtcggcgg tctcgtcgac gagcccgagctcgtccgcgt 13200 ccccttccac tggagcggtg tcaccgtgca cgcggccggt gccgcggcggcccgggtccg 13260 tctcgcctcc gcggggacgg acgccgtctc gctgtccctg acggacggcgagggacgccc 13320 gctggtctcc gtggaacggc tcacgctgcg cccggtcacc gccgatcaggcggcggcgag 13380 ccgcgtcggc gggctgatgc accgggtggc ctggcgtccg tacgccctcgcctcgtccgg 13440 cgaacaggac ccgcacgcca cttcgtacgg gccgaccgcc gtcctcggcaaggacgagct 13500 gaaggtcgcc gccgccctgg agtccgcggg cgtcgaagtc gggctctaccccgacctggc 13560 cgcgctgtcc caggacgtgg cggccggcgc cccggcgccc cgtaccgtccttgcgccgct 13620 gcccgcgggt cccgccgacg gcggcgcgga gggtgtacgg ggcacggtggcccggacgct 13680 ggagctgctc caggcctggc tggccgacga gcacctcgcg ggcacccgcctgctcctggt 13740 cacccgcggt gcggtgcggg accccgaggg gtccggcgcc gacgatggcggcgaggacct 13800 gtcgcacgcg gccgcctggg gtctcgtacg gaccgcgcag accgagaaccccggccgctt 13860 cggccttctc gacctggccg acgacgcctc gtcgtaccgg accctgccgtcggtgctctc 13920 cgacgcgggc ctgcgcgacg aaccgcagct cgccctgcac gacggcaccatcaggctggc 13980 ccgcctggcc tccgtccggc ccgagaccgg caccgccgca ccggcgctcgccccggaggg 14040 cacggtcctg ctgaccggcg gcaccggcgg cctgggcgga ctggtcgcccggcacgtggt 14100 gggcgagtgg ggcgtacgac gcctgctgct ggtgagccgg cggggcacggacgccccggg 14160 cgccgacgag ctcgtgcacg agctggaggc cctgggagcc gacgtctcggtggccgcgtg 14220 cgacgtcgcc gaccgcgaag ccctcaccgc cgtactcgac gccatccccgccgaacaccc 14280 gctcaccgcg gtcgtccaca cggcaggcgt cctctccgac ggcaccctcccgtccatgac 14340 gacggaggac gtggaacacg tactgcggcc caaggtcgac gccgcgttcctcctcgacga 14400 actcacctcg acgcccgcat acgacctggc agcgttcgtc atgttctcctccgccgccgc 14460 cgtcttcggt ggcgcggggc agggcgccta cgccgccgcc aacgccaccctcgacgccct 14520 cgcctggcgc cgccgggcag ccggactccc cgccctctcc ctcggctggggcctctgggc 14580 cgagaccagc ggcatgaccg gcgagctcgg ccaggcggac ctgcgccggatgagccgcgc 14640 gggcatcggc gggatcagcg acgccgaggg catcgcgctc 
ctcgacgccgccctccgcga 14700 cgaccgccac ccggtcctgc tgcccctgcg gctcgacgcc gccgggctgcgggacgcggc 14760 cgggaacgac ccggccggaa tcccggcgct cttccgggac gtcgtcggcgccaggaccgt 14820 ccgggcccgg ccgtccgcgg cctccgcctc gacgacagcc gggacggccggcacgccggg 14880 gacggcggac ggcgcggcgg aaacggcggc ggtcacgctc gccgaccgggccgccaccgt 14940 ggacgggccc gcacggcagc gcctgctgct cgagttcgtc gtcggcgaggtcgccgaagt 15000 actcggccac gcccgcggtc accggatcga cgccgaacgg ggcttcctcgacctcggctt 15060 cgactccctg accgccgtcg aactccgcaa ccggctcaac tccgccggtggcctcgccct 15120 cccggcgacc ctggtcttcg accacccaag cccggcggca ctcgcctcccacctggacgc 15180 cgagctgccg cgcggcgcct cggaccagga cggagccggg aaccggaacgggaacgagaa 15240 cgggacgacg gcgtcccgga gcaccgccga gacggacgcg ctgctggcacaactgacccg 15300 cctggaaggc gccttggtgc tgacgggcct ctcggacgcc cccgggagcgaagaagtcct 15360 ggagcacctg cggtccctgc gctcgatggt cacgggcgag accgggaccgggaccgcgtc 15420 cggagccccg gacggcgccg ggtccggcgc cgaggaccgg ccctgggcggccggggacgg 15480 agccgggggc gggagtgagg acggcgcggg agtgccggac ttcatgaacgcctcggccga 15540 ggaactcttc ggcctcctcg accaggaccc cagcacggac tgatccctgccgcacggtcg 15600 cctcccgccc cggaccccgt cccgggcacc tcgactcgaa tcacttcatgcgcgcctcgg 15660 gcgcctccag gaactcaagg ggacagcgtg tccacggtga acgaagagaagtacctcgac 15720 tacctgcgtc gtgccacggc ggacctccac gaggcccgtg gccgcctccgcgagctggag 15780 gcgaaggcgg gcgagccggt ggcgatcgtc ggcatggcct gccgcctgcccggcggcgtc 15840 gcctcgcccg aggacctgtg gcggctggtg gccggcggcg aggacgcgatctcggagttc 15900 ccccaggacc gcggctggga cgtggagggc ctgtacgacc cgaacccggaggccacgggc 15960 aagagttacg cccgcgaggc cggattcctg tacgaggcgg gcgagttcgacgccgacttc 16020 ttcgggatct cgccgcgcga ggccctcgcc atggacccgc agcagcgtctcctcctggag 16080 gcctcctggg aggcgttcga gcacgccggg atcccggcgg ccaccgcgcgcggcacctcg 16140 gtcggcgtct tcaccggcgt gatgtaccac gactacgcca cccgtctcaccgatgtcccg 16200 gagggcatcg agggctacct gggcaccggc aactccggca gtgtcgcctcgggccgcgtc 16260 gcgtacacgc ttggcctgga ggggccggcc gtcacggtcg acaccgcctgctcgtcctcg 16320 ctggtcgccc tgcacctcgc cgtgcaggcc ctgcgcaagg 
gcgaggtcgacatggcgctc 16380 gccggcggcg tgacggtcat gtcgacgccc agcaccttcg tcgagttcagccgtcagcgc 16440 gggctggcgc cggacggccg gtcgaagtcc ttctcgtcga cggccgacggcaccagctgg 16500 tccgagggcg tcggcgtcct cctcgtcgag cgcctgtccg acgcgcgtcgcaagggccat 16560 cggatcctcg ccgtggtccg gggcaccgcc gtcaaccagg acggcgccagcagcggcctc 16620 acggctccga acgggccgtc gcagcagcgc gtcatccgac gtgccctggcggacgcccgg 16680 ctcacgacct ccgacgtgga cgtcgtcgag gcccacggca cgggtacgcgactcggcgac 16740 ccgatcgagg cgcaggccgt catcgccacg tacgggcagg gccgtgacggcgaacagccg 16800 ctgcgcctcg ggtcgttgaa gtccaacatc ggacacaccc aggccgccgccggtgtctcc 16860 ggcgtgatca agatggtcca ggcgatgcgc cacggcgtcc tgccgaagacgctccacgtg 16920 gagaagccga cggaccaggt ggactggtcc gcgggcgcgg tcgagctgctcaccgaggcc 16980 atggactggc cggacaaggg cgacggcgga ctgcgcaggg ccgcggtctcctccttcggc 17040 gtcagcggga cgaacgcgca cgtcgtgctc gaagaggccc cggcggccgaggagacccct 17100 gcctccgagg cgaccccggc cgtcgagccg tcggtcggcg ccggcctggtgccgtggctg 17160 gtgtcggcga agactccggc cgcgctggac gcccagatcg gacgcctcgccgcgttcgcc 17220 tcgcagggcc gtacggacgc cgccgatccg ggcgcggtcg ctcgcgtactggccggcggg 17280 cgcgccgagt tcgagcaccg ggccgtcgtg ctcggcaccg gacaggacgatttcgcgcag 17340 gcgctgaccg ctccggaagg actgatacgc ggcacgccct cggacgtgggccgggtggcg 17400 ttcgtgttcc ccggtcaggg cacgcagtgg gccgggatgg gcgccgaactcctcgacgtg 17460 tcgaaggagt tcgcggcggc catggccgag tgcgagagcg cgctctcccgctatgtcgac 17520 tggtcgctgg aggccgtcgt ccggcaggcg ccgggcgcgc ccacgctggagcgggtcgac 17580 gtcgtccagc ccgtgacctt cgctgtcatg gtttcgctgg cgaaggtctggcagcaccac 17640 ggcgtgacgc cgcaggccgt cgtcggccac tcgcagggcg agatcgccgccgcgtacgtc 17700 gccggtgccc tcaccctcga cgacgccgcc cgcgtcgtca ccctgcgcagcaagtccatc 17760 gccgcccacc tcgccggcaa gggcggcatg atctccctcg ccctcagcgaggaagccacc 17820 cggcagcgca tcgagaacct ccacggactg tcgatcgccg ccgtcaacggccccaccgcc 17880 accgtggttt cgggcgaccc cacccagatc caagagctcg ctcaggcgtgtgaggccgac 17940 ggggtccgcg cacggatcat ccccgtcgac tacgcctccc acagcgcccacgtcgagacc 18000 atcgagagcg aactcgccga ggtcctcgcc gggctcagcc 
cgcggacacctgaggtgccg 18060 ttcttctcga cactcgaagg cgcctggatc accgagccgg tgctcgacggcacctactgg 18120 taccgcaacc tccgccaccg cgtcggcttc gcccccgccg tcgagaccctcgccaccgac 18180 gaaggcttca cccacttcat cgaggtcagc gcccaccccg tcctcaccatgaccctcccc 18240 gagaccgtca ccggcctcgg caccctccgc cgcgaacagg gaggccaggagcgtctggtc 18300 acctcactcg ccgaagcctg gaccaacggc ctcaccatcg actgggcgcccgtcctcccc 18360 accgcaaccg gccaccaccc cgagctcccc acctacgcct tccagcgccgtcactactgg 18420 ctccacgact cccccgccgt ccagggctcc gtgcaggact cctggcgctaccgcatcgac 18480 tggaagcgcc tcgcggtcgc cgacgcgtcc gagcgcgccg ggctgtccgggcgctggctc 18540 gtcgtcgtcc ccgaggaccg ttccgccgag gccgccccgg tgctcgccgcgctgtccggc 18600 gccggcgccg accccgtaca gctggacgtg tccccgctgg gcgaccggcagcggctcgcc 18660 gcgacgctgg gcgaggccct ggcggcggcc ggtggagccg tcgacggcgtcctctcgctg 18720 ctcgcgtggg acgagagcgc gcaccccggc caccccgccc ccttcacccggggcaccggc 18780 gccaccctca ccctggtgca ggcgctggag gacgccggcg tcgccgccccgctgtggtgc 18840 gtgacccacg gcgcggtgtc cgtcggccgg gccgaccacg tcacctcccccgcccaggcc 18900 atggtgtggg gcatgggccg ggtcgccgcc ctggagcacc ccgagcggtggggcggcctg 18960 atcgacctgc cctcggacgc cgaccgggcg gccctggacc gcatgaccacggtcctcgcc 19020 ggcggtacgg gtgaggacca ggtcgcggta cgcgcctccg ggctgctcgcccgccgcctc 19080 gtccgcgcct ccctcccggc gcacggcacg gcttcgccgt ggtggcaggccgacggcacg 19140 gtgctcgtca ccggtgccga ggagcctgcg gccgccgagg ccgcacgccggctggcccgc 19200 gacggcgccg gacacctcct cctccacacc accccctccg gcagcgaaggcgccgaaggc 19260 acctccggtg ccgccgagga ctccggcctc gccgggctcg tcgccgaactcgcggacctg 19320 ggcgcgacgg ccaccgtcgt gacctgcgac ctcacggacg cggaggcggccgcccggctg 19380 ctcgccggcg tctccgacgc gcacccgctc agcgccgtcc tccacctgccgcccaccgtc 19440 gactccgagc cgctcgccgc gaccgacgcg gacgcgctcg cccgtgtcgtgaccgcgaag 19500 gccaccgccg cgctccacct ggaccgcctc ctgcgggagg ccgcggctgccggaggccgt 19560 ccgcccgtcc tggtcctctt ctcctcggtc gccgcgatct ggggcggcgccggtcagggc 19620 gcgtacgccg ccggtacggc cttcctcgac gccctcgccg gtcagcaccgggccgacggc 19680 cccaccgtga cctcggtggc ctggagcccc tgggagggca 
gccgcgtcaccgagggtgcg 19740 accggggagc ggctgcgccg cctcggcctg cgccccctcg cccccgcgacggcgctcacc 19800 gccctggaca ccgcgctcgg ccacggcgac accgccgtca cgatcgccgacgtcgactgg 19860 tcgagcttcg cccccggctt caccacggcc cggccgggca ccctcctcgccgatctgccc 19920 gaggcgcgcc gcgcgctcga cgagcagcag tcgacgacgg ccgccgacgacaccgtcctg 19980 agccgcgagc tcggtgcgct caccggcgcc gaacagcagc gccgtatgcaggagttggtc 20040 cgcgagcacc tcgccgtggt cctcaaccac ccctcccccg aggccgtcgacacggggcgg 20100 gccttccgtg acctcggatt cgactcgctg acggcggtcg agctccgcaaccgcctcaag 20160 aacgccaccg gcctggccct cccggccact ctggtcttcg actacccgaccccccggacg 20220 ctggcggagt tcctcctcgc ggagatcctg ggcgagcagg ccggtgccggcgagcagctt 20280 ccggtggacg gcggggtcga cgacgagccc gtcgcgatcg tcggcatggcgtgccgcctg 20340 ccgggcggtg tcgcctcgcc ggaggacctg tggcggctgg tggccggcggcgaggacgcg 20400 atctccggct tcccgcagga ccgcggctgg gacgtggagg ggctgtacgacccggacccg 20460 gacgcgtccg ggcggacgta ctgccgtgcc ggtggcttcc tcgacgaggcgggcgagttc 20520 gacgccgact tcttcgggat ctcgccgcgc gaggccctcg ccatggacccgcagcagcgg 20580 ctcctcctgg agacctcctg ggaggccgtc gaggacgccg ggatcgacccgacctccctt 20640 caggggcagc aggtcggcgt gttcgcgggc accaacggcc cccactacgagccgctgctc 20700 cgcaacaccg ccgaggatct tgagggttac gtcgggacgg gcaacgccgccagcatcatg 20760 tcgggccgtg tctcgtacac cctcggcctg gagggcccgg ccgtcacggtcgacaccgcc 20820 tgctcctcct cgctggtcgc cctgcacctc gccgtgcagg ccctgcgcaagggcgaatgc 20880 ggactggcgc tcgcgggcgg tgtgacggtc atgtcgacgc ccacgacgttcgtggagttc 20940 agccggcagc gcgggctcgc ggaggacggc cggtcgaagg cgttcgccgcgtcggcggac 21000 ggcttcggcc cggcggaggg cgtcggcatg ctcctcgtcg agcgcctgtcggacgcccgc 21060 cgcaacggac accgtgtgct ggcggtcgtg cgcggcagcg cggtcaaccaggacggcgcg 21120 agcaacggcc tgaccgcccc gaacgggccc tcgcagcagc gcgtcatccggcgcgcgctc 21180 gcggacgccc gactgacgac cgccgacgtg gacgtcgtcg aggcccacggcacgggcacg 21240 cgactcggcg acccgatcga ggcacaggcc ctcatcgcca cctacggccaggggcgcgac 21300 accgaacagc cgctgcgcct ggggtcgttg aagtccaaca tcggacacacccaggccgcc 21360 gccggtgtct ccggcatcat caagatggtc caggcgatgc 
gccacggcgtcctgccgaag 21420 acgctccacg tggaccggcc gtcggaccag atcgactggt cggcgggcacggtcgagctg 21480 ctcaccgagg ccatggactg gccgaggaag caggagggcg ggctgcgccgcgcggccgtc 21540 tcctccttcg gcatcagcgg cacgaacgcg cacatcgtgc tcgaagaagccccggtcgac 21600 gaggacgccc cggcggacga gccgtcggtc ggcggtgtgg tgccgtggctcgtgtccgcg 21660 aagactccgg ccgcgctgga cgcccagatc ggacgcctcg ccgcgttcgcctcgcagggc 21720 cgtacggacg ccgccgatcc gggcgcggtc gctcgcgtac tggccggcgggcgtgcgcag 21780 ttcgagcacc gggccgtcgc gctcggcacc ggacaggacg acctggcggccgcactggcc 21840 gcgcctgagg gtctggtccg gggtgtggcc tccggtgtgg gtcgagtggcgttcgtgttc 21900 ccgggacagg gcacgcagtg ggccgggatg ggtgccgaac tcctcgacgtgtcgaaggag 21960 ttcgcggcgg ccatggccga gtgcgaggcc gcgctcgctc cgtacgtggactggtcgctg 22020 gaggccgtcg tccgacaggc ccccggcgcg cccacgctgg agcgggtcgatgtcgtccag 22080 cccgtgacgt tcgccgtcat ggtctcgctg gcgaaggtct ggcagcaccacggggtgacc 22140 ccgcaagccg tcgtcggcca ctcgcagggc gagatcgccg ccgcgtacgtcgccggtgcc 22200 ctgagcctgg acgacgccgc tcgtgtcgtg accctgcgca gcaagtccatcggcgcccac 22260 ctcgcgggcc agggcggcat gctgtccctc gcgctgagcg aggcggccgttgtggagcga 22320 ctggccgggt tcgacgggct gtccgtcgcc gccgtcaacg ggcctaccgccaccgtggtt 22380 tcgggcgacc cgacccagat ccaagagctc gctcaggcgt gtgaggccgacggggtccgc 22440 gcacggatca tccccgtcga ctacgcctcc cacagcgccc acgtcgagaccatcgagagc 22500 gaactcgccg acgtcctggc ggggttgtcc ccccagacac cccaggtccccttcttctcc 22560 accctcgaag gcgcctggat caccgaaccc gccctcgacg gcggctactggtaccgcaac 22620 ctccgccatc gtgtgggctt cgccccggcc gtcgaaaccc tggccaccgacgaaggcttc 22680 acccacttcg tcgaggtcag cgcccacccc gtcctcacca tggcgctgcccgagaccgtc 22740 accggactcg gcaccctccg ccgtgacaac ggcggacagc accgcctcaccacctccctc 22800 gccgaggcct gggccaacgg cctcaccgtc gactgggcct ctctcctccccaccacgacc 22860 acccaccccg atctgcccac ctacgccttc cagaccgagc gctactggccgcagcccgac 22920 ctctccgccg ccggtgacat cacctccgcc ggtctcgggg cggccgagcacccgctgctc 22980 ggcgcggccg tggcgctcgc ggactccgac ggctgcctgc tcacggggagcctctccctc 23040 cgtacgcacc cctggctggc ggaccacgcg gtggccggca 
ccgtgctgctgccgggaacg 23100 gcgttcgtgg agctggcgtt ccgagccggg gaccaggtcg gttgcgatctggtcgaggag 23160 ctcaccctcg acgcgccgct cgtgctgccc cgtcgtggcg cggtccgtgtgcagctgtcc 23220 gtcggcgcga gcgacgagtc cgggcgtcgt accttcgggc tctacgcgcacccggaggac 23280 gcgccgggcg aggcggagtg gacgcggcac gccaccggtg tgctggccgcccgtgcggac 23340 cgcaccgccc ccgtcgccga cccggaggcc tggccgccgc cgggcgccgagccggtggac 23400 gtggacggtc tgtacgagcg cttcgcggcg aacggctacg gctacggccccctcttccag 23460 ggcgtccgtg gtgtctggcg gcgtggcgac gaggtgttcg ccgacgtggccctgccggcc 23520 gaggtcgccg gtgccgaggg cgcgcggttc ggccttcacc cggcgctgctcgacgccgcc 23580 gtgcaggcgg ccggtgcggg ccggggcgtt cggcgcgggc acgcggctgccgttcgcctg 23640 gagcgggatc tcctgtacgc ggtcggcgcc accgccctcc gcgtgcggctggcccccgcc 23700 ggcccggaca cggtgtccgt gagcgccgcc gactcctccg ggcagccggtgttcgccgcg 23760 gactccctca cggtgctgcc cgtcgacccc gcgcagctgg cggccttcagcgacccgact 23820 ctggacgcgc tgcacctgct ggagtggacc gcctgggacg gtgccgcgcaggccctgccc 23880 ggcgcggtcg tgctgggcgg cgacgccgac ggtctcgccg cggcgctgcgcgccggtggc 23940 accgaggtcc tgtccttccc ggaccttacg gacctggtgg aggccgtcgaccggggcgag 24000 accccggccc cggcgaccgt cctggtggcc tgccccgccg ccggccccgatgggccggag 24060 catgtccgcg aggccctgca cgggtcgctc gcgctgatgc aggcctggctggccgacgag 24120 cggttcaccg atgggcgcct ggtgctcgtg acccgcgacg cggtcgccgcccgttccggc 24180 gacggcctgc ggtccacggg acaggccgcc gtctggggcc tcggccggtccgcgcagacg 24240 gagagcccgg gccggttcgt cctgctcgac ctcgccgggg aagcccggacggccggggac 24300 gccaccgccg gggacggcct gacgaccggg gacgccaccg tcggcggcacctctggagac 24360 gccgccctcg gcagcgccct cgcgaccgcc ctcggctcgg gcgagccgcagctcgccctc 24420 cgggacgggg cgctcctcgt accccgcctg gcgcgggccg ccgcgcccgccgcggccgac 24480 ggcctcgccg cggccgacgg cctcgccgct ctgccgctgc ccgccgctccggccctctgg 24540 cgtctggagc ccggtacgga cggcagcctg gagagcctca cggcggcgcccggcgacgcc 24600 gagaccctcg ccccggagcc gctcggcccg ggacaggtcc gcatcgcgatccgggccacc 24660 ggtctcaact tccgcgacgt cctgatcgcc ctcggcatgt accccgatccggcgctgatg 24720 ggcaccgagg gagccggcgt ggtcaccgcg accggccccg 
gcgtcacgcacctcgccccc 24780 ggcgaccggg tcatgggcct gctctccggc gcgtacgccc cggtcgtcgtggcggacgcg 24840 cggaccgtcg cgcggatgcc cgaggggtgg acgttcgccc agggcgcctccgtgccggtg 24900 gtgttcctga cggccgtcta cgccctgcgc gacctggcgg acgtcaagcccggcgagcgc 24960 ctcctggtcc actccgccgc cggtggcgtg ggcatggccg ccgtgcagctcgcccggcac 25020 tggggcgtgg aggtccacgg cacggcgagt cacgggaagt gggacgccctgcgcgcgctc 25080 ggcctggacg acgcgcacat cgcctcctcc cgcaccctgg acttcgagtccgcgttccgt 25140 gccgcttccg gcggggcggg catggacgtc gtactgaact cgctcgcccgcgagttcgtc 25200 gacgcctcgc tgcgcctgct cgggccgggc ggccggttcg tggagatggggaagaccgac 25260 gtccgcgacg cggagcgggt cgccgccgac caccccggtg tcggctaccgcgccttcgac 25320 ctgggcgagg ccgggccgga gcggatcggc gagatgctcg ccgaggtcatcgccctcttc 25380 gaggacgggg tgctccggca cctgcccgtc acgacctggg acgtgcgccgggcccgcgac 25440 gccttccggc acgtcagcca ggcccgccac acgggcaagg tcgtcctcacgatgccgtcg 25500 ggcctcgacc cggagggtac ggtcctgctg accggcggca ccggtgcgctggggggcatc 25560 gtggcccggc acgtggtggg cgagtggggc gtacgacgcc tgctgctcgtgagccggcgg 25620 ggcacggacg ccccgggcgc cggcgagctc gtgcacgagc tggaggccctgggagccgac 25680 gtctcggtgg ccgcgtgcga cgtcgccgac cgcgaagccc tcaccgccgtactcgactcg 25740 atccccgccg aacacccgct caccgcggtc gtccacacgg caggcgtcctctccgacggc 25800 accctcccct cgatgacagc ggaggatgtg gaacacgtac tgcgtcccaaggtcgacgcc 25860 gcgttcctcc tcgacgaact cacctcgacg cccggctacg acctggcagcgttcgtcatg 25920 ttctcctccg ccgccgccgt cttcggtggc gcggggcagg gcgcctacgccgccgccaac 25980 gccaccctcg acgccctcgc ctggcgccgc cggacagccg gactccccgccctctccctc 26040 ggctggggcc tctgggccga gaccagcggc atgaccggcg gactcagcgacaccgaccgc 26100 tcgcggctgg cccgttccgg ggcgacgccc atggacagcg agctgaccctgtccctcctg 26160 gacgcggcca tgcgccgcga cgacccggcg ctcgtcccga tcgccctggacgtcgccgcg 26220 ctccgcgccc agcagcgcga cggcatgctg gcgccgctgc tcagcgggctcacccgcgga 26280 tcgcgggtcg gcggcgcgcc ggtcaaccag cgcagggcag ccgccggaggcgcgggcgag 26340 gcggacacgg acctcggcgg gcggctcgcc gcgatgacac cggacgaccgggtcgcgcac 26400 ctgcgggacc tcgtccgtac gcacgtggcg accgtcctgg 
gacacggcaccccgagccgg 26460 gtggacctgg agcgggcctt ccgcgacacc ggtttcgact cgctcaccgccgtcgaactc 26520 cgcaaccgtc tcaacgccgc gaccgggctg cggctgccgg ccacgctggtcttcgaccac 26580 cccaccccgg gggagctcgc cgggcacctg ctcgacgaac tcgccacggccgcgggcggg 26640 tcctgggcgg aaggcaccgg gtccggagac acggcctcgg cgaccgatcggcagaccacg 26700 gcggccctcg ccgaactcga ccggctggaa ggcgtgctcg cctccctcgcgcccgccgcc 26760 ggcggccgtc cggagctcgc cgcccggctc agggcgctgg ccgcggccctgggggacgac 26820 ggcgacgacg ccaccgacct ggacgaggcg tccgacgacg acctcttctccttcatcgac 26880 aaggagctgg gcgactccga cttctgacct gcccgacacc accggcaccaccggcaccac 26940 cagcccccct cacacacgga acacggaacg gacaggcgag aacgggagccatggcgaaca 27000 acgaagacaa gctccgcgac tacctcaagc gcgtcaccgc cgagctgcagcagaacacca 27060 ggcgtctgcg cgagatcgag ggacgcacgc acgagccggt ggcgatcgtgggcatggcct 27120 gccgcctgcc gggcggtgtc gcctcgcccg aggacctgtg gcagctggtggccggggacg 27180 gggacgcgat ctcggagttc ccgcaggacc gcggctggga cgtggaggggctgtacgacc 27240 ccgacccgga cgcgtccggc aggacgtact gccggtccgg cggattcctgcacgacgccg 27300 gcgagttcga cgccgacttc ttcgggatct cgccgcgcga ggccctcgccatggacccgc 27360 agcagcgact gtccctcacc accgcgtggg aggcgatcga gagcgcgggcatcgacccga 27420 cggccctgaa gggcagcggc ctcggcgtct tcgtcggcgg ctggcacaccggctacacct 27480 cggggcagac caccgccgtg cagtcgcccg agctggaggg ccacctggtcagcggcgcgg 27540 cgctgggctt cctgtccggc cgtatcgcgt acgtcctcgg tacggacggaccggccctga 27600 ccgtggacac ggcctgctcg tcctcgctgg tcgccctgca cctcgccgtgcaggccctcc 27660 gcaagggcga gtgcgacatg gccctcgccg gtggtgtcac ggtcatgcccaacgcggacc 27720 tgttcgtgca gttcagccgg cagcgcgggc tggccgcgga cggccggtcgaaggcgttcg 27780 ccacctcggc ggacggcttc ggccccgcgg agggcgccgg agtcctgctggtggagcgcc 27840 tgtcggacgc ccgccgcaac ggacaccgga tcctcgcggt cgtccgcggcagcgcggtca 27900 accaggacgg cgccagcaac ggcctcacgg ctccgcacgg gccctcccagcagcgcgtca 27960 tccgacgggc cctggcggac gcccggctcg cgccgggtga cgtggacgtcgtcgaggcgc 28020 acggcacggg cacgcggctc ggcgacccga tcgaggcgca ggccctcatcgccacctacg 28080 gccaggagaa gagcagcgaa cagccgctga ggctgggcgc 
gttgaagtcgaacatcgggc 28140 acacgcaggc cgcggccggt gtcgcaggtg tcatcaagat ggtccaggcgatgcgccacg 28200 gactgctgcc gaagacgctg cacgtcgacg agccctcgga ccagatcgactggtcggcgg 28260 gcacggtgga actcctcacc gaggccgtcg actggccgga gaagcaggacggcgggctgc 28320 gccgcgcggc tgtctcctcc ttcggcatca gcgggacgaa cgcgcacgtcgtcctggagg 28380 aggccccggc ggtcgaggac tccccggccg tcgagccgcc ggccggtggcggtgtggtgc 28440 cgtggccggt gtccgcgaag actccggccg cgctggacgc ccagatcgggcagctcgccg 28500 cgtacgcgga cggtcgtacg gacgtggatc cggcggtggc cgcccgcgccctggtcgaca 28560 gccgtacggc gatggagcac cgcgcggtcg cggtcggcga cagccgggaggcactgcggg 28620 acgccctgcg gatgccggaa ggactggtac gcggcacgtc ctcggacgtgggccgggtgg 28680 cgttcgtctt ccccggccag ggcacgcagt gggccggcat gggcgccgaactccttgaca 28740 gctcaccgga gttcgctgcc tcgatggccg aatgcgagac cgcgctctcccgctacgtcg 28800 actggtctct tgaagccgtc gtccgacagg aacccggcgc acccacgctcgaccgcgtcg 28860 acgtcgtcca gcccgtgacc ttcgctgtca tggtctcgct ggcgaaggtctggcagcacc 28920 acggcatcac cccccaggcc gtcgtcggcc actcgcaggg cgagatcgccgccgcgtacg 28980 tcgccggtgc actcaccctc gacgacgccg cccgcgtcgt caccctgcgcagcaagtcca 29040 tcgccgccca cctcgccggc aagggcggca tgatctccct cgccctcgacgaggcggccg 29100 tcctgaagcg actgagcgac ttcgacggac tctccgtcgc cgccgtcaacggccccaccg 29160 ccaccgtcgt ctccggcgac ccgacccaga tcgaggaact cgcccgcacctgcgaggccg 29220 acggcgtccg tgcgcggatc atcccggtcg actacgcctc ccacagccggcaggtcgaga 29280 tcatcgagaa ggagctggcc gaggtcctcg ccggactcgc cccgcaggctccgcacgtgc 29340 cgttcttctc caccctcgaa ggcacctgga tcaccgagcc ggtgctcgacggcacctact 29400 ggtaccgcaa cctgcgccat cgcgtgggct tcgcccccgc cgtggagaccttggcggttg 29460 acggcttcac ccacttcatc gaggtcagcg cccaccccgt cctcaccatgaccctccccg 29520 agaccgtcac cggcctcggc accctccgcc gcgaacaggg aggccaggagcgtctggtca 29580 cctcactcgc cgaagcctgg gccaacggcc tcaccatcga ctgggcgcccatcctcccca 29640 ccgcaaccgg ccaccacccc gagctcccca cctacgcctt ccagaccgagcgcttctggc 29700 tgcagagctc cgcgcccacc agcgccgccg acgactggcg ttaccgcgtcgagtggaagc 29760 cgctgacggc ctccggccag gcggacctgt ccgggcggtg 
gatcgtcgccgtcgggagcg 29820 agccagaagc cgagctgctg ggcgcgctga aggccgcggg agcggaggtcgacgtactgg 29880 aagccggggc ggacgacgac cgtgaggccc tcgccgcccg gctcaccgcactgacgaccg 29940 gcgacggctt caccggcgtg gtctcgctcc tcgacgacct cgtgccacaggtcgcctggg 30000 tgcaggcact cggcgacgcc ggaatcaagg cgcccctgtg gtccgtcacccagggcgcgg 30060 tctccgtcgg acgtctcgac acccccgccg accccgaccg ggccatgctctggggcctcg 30120 gccgcgtcgt cgcccttgag caccccgaac gctgggccgg cctcgtcgacctccccgccc 30180 agcccgatgc cgccgccctc gcccacctcg tcaccgcact ctccggcgccaccggcgagg 30240 accagatcgc catccgcacc accggactcc acgcccgccg cctcgcccgcgcacccctcc 30300 acggacgtcg gcccacccgc gactggcagc cccacggcac cgtcctcatcaccggcggca 30360 ccggagccct cggcagccac gccgcacgct ggatggccca ccacggagccgaacacctcc 30420 tcctcgtcag ccgcagcggc gaacaagccc ccggagccac ccaactcaccgccgaactca 30480 ccgcatcggg cgcccgcgtc accatcgccg cctgcgacgt cgccgacccccacgccatgc 30540 gcaccctcct cgacgccatc cccgccgaga cgcccctcac cgccgtcgtccacaccgccg 30600 gcgcaccggg cggcgatccg ctggacgtca ccggcccgga ggacatcgcccgcatcctgg 30660 gcgcgaagac gagcggcgcc gaggtcctcg acgacctgct ccgcggcactccgctggacg 30720 ccttcgtcct ctactcctcg aacgccgggg tctggggcag cggcagccagggcgtctacg 30780 cggcggccaa cgcccacctc gacgcgctcg ccgcccggcg ccgcgcccggggcgagacgg 30840 cgacctcggt cgcctggggc ctctgggccg gcgacggcat gggccggggcgccgacgacg 30900 cgtactggca gcgtcgcggc atccgtccga tgagccccga ccgcgccctggacgaactgg 30960 ccaaggccct gagccacgac gagaccttcg tcgccgtggc cgatgtcgactgggagcggt 31020 tcgcgcccgc gttcacggtg tcccgtccca gccttctgct cgacggcgtcccggaggccc 31080 ggcaggcgct cgccgcaccc gtcggtgccc cggctcccgg cgacgccgccgtggcgccga 31140 ccgggcagtc gtcggcgctg gccgcgatca ccgcgctccc cgagcccgagcgccggccgg 31200 cgctcctcac cctcgtccgt acccacgcgg cggccgtact cggccattcctcccccgacc 31260 gggtggcccc cggccgtgcc ttcaccgagc tcggcttcga ctcgctgacggccgtgcagc 31320 tccgcaacca gctctccacg gtggtcggca acaggctccc cgccaccacggtcttcgacc 31380 acccgacgcc cgccgcactc gccgcgcacc tccacgaggc gtacctcgcaccggccgagc 31440 cggccccgac ggactgggag gggcgggtgc gccgggccct 
ggccgaactgcccctcgacc 31500 ggctgcggga cgcgggggtc ctcgacaccg tcctgcgcct caccggcatcgagcccgagc 31560 cgggttccgg cggttcggac ggcggcgccg ccgaccctgg tgcggagccggaggcgtcga 31620 tcgacgacct ggacgccgag gccctgatcc ggatggctct cggcccccgtaacacctgac 31680 ccgaccgcgg tcctgcccca cgcgccgcac cccgcgcatc ccgcgcaccacccgccccca 31740 cacgcccaca accccatcca cgagcggaag accacaccca gatgacgagttccaacgaac 31800 agttggtgga cgctctgcgc gcctctctca aggagaacga agaactccggaaagagagcc 31860 gtcgccgggc cgaccgtcgg caggagccca tggcgatcgt cggcatgagctgccggttcg 31920 cgggcggaat ccggtccccc gaggacctct gggacgccgt cgccgcgggcaaggacctgg 31980 tctccgaggt accggaggag cgcggctggg acatcgactc cctctacgacccggtgcccg 32040 ggcgcaaggg cacgacgtac gtccgcaacg ccgcgttcct cgacgacgccgccggattcg 32100 acgcggcctt cttcgggatc tcgccgcgcg aggccctcgc catggacccgcagcagcggc 32160 agctcctcga agcctcctgg gaggtcttcg agcgggccgg catcgaccccgcgtcggtcc 32220 gcggcaccga cgtcggcgtg tacgtgggct gtggctacca ggactacgcgccggacatcc 32280 gggtcgcccc cgaaggcacc ggcggttacg tcgtcaccgg caactcctccgccgtggcct 32340 ccgggcgcat cgcgtactcc ctcggcctgg agggacccgc cgtgaccgtggacacggcgt 32400 gctcctcttc gctcgtcgcc ctgcacctcg ccctgaaggg cctgcggaacggcgactgct 32460 cgacggcact cgtgggcggc gtggccgtcc tcgcgacgcc gggcgcgttcatcgagttca 32520 gcagccagca ggccatggcc gccgacggcc ggaccaaggg cttcgcctcggcggcggacg 32580 gcctcgcctg gggcgagggc gtcgccgtac tcctcctcga acggctctccgacgcgcggc 32640 gcaagggcca ccgggtcctg gccgtcgtgc gcggcagcgc catcaaccaggacggcgcga 32700 gcaacggcct cacggctccg cacgggccct cccagcagca cctgatccgccaggccctgg 32760 ccgacgcgcg gctcacgtcg agcgacgtgg acgtcgtgga gggccacggcacggggaccc 32820 gtctcggcga cccgatcgag gcgcaggcgc tgctcgccac gtacgggcaggggcgcgccc 32880 cggggcagcc gctgcggctg gggacgctga agtcgaacat cgggcacacgcaggccgctt 32940 cgggtgtcgc cggtgtcatc aagatggtgc aggcgctgcg ccacggggtgctgccgaaga 33000 ccctgcacgt ggacgagccg acggaccagg tcgactggtc ggccggttcggtcgagctgc 33060 tcaccgaggc cgtggactgg ccggagcggc cgggccggct ccgccgggcgggcgtctccg 33120 cgttcggcgt gggcgggacg aacgcgcacg tcgtcctgga 
ggaggccccggcggtcgagg 33180 agtcccctgc cgtcgagccg ccggccggtg gcggcgtggt gccgtggccggtgtccgcga 33240 agacctcggc cgcactggac gcccagatcg ggcagctcgc cgcatacgcggaagaccgca 33300 cggacgtgga tccggcggtg gccgcccgcg ccctggtcga cagccgtacggcgatggagc 33360 accgcgcggt cgcggtcggc gacagccggg aggcactgcg ggacgccctgcggatgccgg 33420 aaggactggt acggggcacg gtcaccgatc cgggccgggt ggcgttcgtcttccccggcc 33480 agggcacgca gtgggccggc atgggcgccg aactcctcga cagctcacccgaattcgccg 33540 ccgccatggc cgaatgcgag accgcactct ccccgtacgt cgactggtctctcgaagccg 33600 tcgtccgaca ggctcccagc gcaccgacac tcgaccgcgt cgacgtcgtccagcccgtca 33660 ccttcgccgt catggtctcc ctcgccaagg tctggcagca ccacggcatcacccccgagg 33720 ccgtcatcgg ccactcccag ggcgagatcg ccgccgcgta cgtcgccggtgccctcaccc 33780 tcgacgacgc cgctcgtgtc gtgaccctcc gcagcaagtc catcgccgcccacctcgccg 33840 gcaagggcgg catgatctcc ctcgccctca gcgaggaagc cacccggcagcgcatcgaga 33900 acctccacgg actgtcgatc gccgccgtca acgggcctac cgccaccgtggtttcgggcg 33960 accccaccca gatccaagaa cttgctcagg cgtgtgaggc cgacggcatccgcgcacgga 34020 tcatccccgt cgactacgcc tcccacagcg cccacgtcga gaccatcgagaacgaactcg 34080 ccgacgtcct ggcggggttg tccccccaga caccccaggt ccccttcttctccaccctcg 34140 aaggcacctg gatcaccgaa cccgccctcg acggcggcta ctggtaccgcaacctccgcc 34200 atcgtgtggg cttcgccccg gccgtcgaga ccctcgccac cgacgaaggcttcacccact 34260 tcatcgaggt cagcgcccac cccgtcctca ccatgaccct ccccgacaaggtcaccggcc 34320 tggccaccct ccgacgcgag gacggcggac agcaccgcct caccacctcccttgccgagg 34380 cctgggccaa cggcctcgcc ctcgactggg cctccctcct gcccgccacgggcgccctca 34440 gccccgccgt ccccgacctc ccgacgtacg ccttccagca ccgctcgtactggatcagcc 34500 ccgcgggtcc cggcgaggcg cccgcgcaca ccgcttccgg gcgcgaggccgtcgccgaga 34560 cggggctcgc gtggggcccg ggtgccgagg acctcgacga ggagggccggcgcagcgccg 34620 tactcgcgat ggtgatgcgg caggcggcct ccgtgctccg gtgcgactcgcccgaagagg 34680 tccccgtcga ccgcccgctg cgggagatcg gcttcgactc gctgaccgccgtcgacttcc 34740 gcaaccgcgt caaccggctg accggtctcc agctgccgcc caccgtcgtgttccagcacc 34800 cgacgcccgt cgcgctcgcc gagcgcatca gcgacgagct 
ggccgagcggaactgggccg 34860 tcgccgagcc gtcggatcac gagcaggcgg aggaggagaa ggccgccgctccggcggggg 34920 cccgctccgg ggccgacacc ggcgccggcg ccgggatgtt ccgcgccctgttccggcagg 34980 ccgtggagga cgaccggtac ggcgagttcc tcgacgtcct cgccgaagcctccgcgttcc 35040 gcccgcagtt cgcctcgccc gaggcctgct cggagcggct cgacccggtgctgctcgccg 35100 gcggtccgac ggaccgggcg gaaggccgtg ccgttctcgt cggctgcaccggcaccgcgg 35160 cgaacggcgg cccgcacgag ttcctgcggc tcagcacctc cttccaggaggagcgggact 35220 tcctcgccgt acctctcccc ggctacggca cgggtacggg caccggcacggccctcctcc 35280 cggccgatct cgacaccgcg ctcgacgccc aggcccgggc gatcctccgggccgccgggg 35340 acgccccggt cgtcctgctc gggcactccg gcggcgccct gctcgcgcacgagctggcct 35400 tccgcctgga gcgggcgcac ggcgcgccgc cggccgggat cgtcctggtcgacccctatc 35460 cgccgggcca tcaggagccc atcgaggtgt ggagcaggca gctgggcgagggcctgttcg 35520 cgggcgagct ggagccgatg tccgatgcgc ggctgctggc catgggccggtacgcgcggt 35580 tcctcgccgg cccgcggccg ggccgcagca gcgcgcccgt gcttctggtccgtgcctccg 35640 aaccgctggg cgactggcag gaggagcggg gcgactggcg tgcccactgggaccttccgc 35700 acaccgtcgc ggacgtgccg ggcgaccact tcacgatgat gcgggaccacgcgccggccg 35760 tcgccgaggc cgtcctctcc tggctcgacg ccatcgaggg catcgagggggcgggcaagt 35820 gaccgacaga cctctgaacg tggacagcgg actgtggatc cggcgcttccaccccgcgcc 35880 gaacagcgcg gtgcggctgg tctgcctgcc gcacgccggc ggctccgccagctacttctt 35940 ccgcttctcg gaggagctgc acccctccgt cgaggccctg tcggtgcagtatccgggccg 36000 ccaggaccgg cgtgccgagc cgtgtctgga gagcgtcgag gagctcgccgagcatgtggt 36060 cgcggccacc gaaccctggt ggcaggaggg ccggctggcc ttcttcgggcacagcctcgg 36120 cgcctccgtc gccttcgaga cggcccgcat cctggaacag cggcacggggtacggcccga 36180 gggcctgtac gtctccggtc ggcgcgcccc gtcgctggcg ccggaccggctcgtccacca 36240 gctggacgac cgggcgttcc tggccgagat ccggcggctc agcggcaccgacgagcggtt 36300 cctccaggac gacgagctgc tgcggctggt gctgcccgcg ctgcgcagcgactacaaggc 36360 ggcggagacg tacctgcacc ggccgtccgc caagctcacc tgcccggtgatggccctggc 36420 cggcgaccgt gacccgaagg cgccgctgaa cgaggtggcc gagtggcgtcggcacaccag 36480 cgggccgttc tgcctccggg cgtactccgg cggccacttc 
tacctcaacgaccagtggca 36540 cgagatctgc aacgacatct ccgaccacct gctcgtcacc cgcggcgcgcccgatgcccg 36600 cgtcgtgcag cccccgacca gccttatcga aggagcggcg aagagatggcagaacccacg 36660 gtgaccgacg acctgacggg ggccctcacg cagcccccgc tgggccgcaccgtccgcgcg 36720 gtggccgacc gtgaactcgg cacccacctc ctggagaccc gcggcatccactggatcc 36778 6 11877 PRT Streptomyces venezuelae 6 Met Ala Met Arg AspSer Ile Pro Arg Arg Ala Asp Arg Asp Thr Leu 1 5 10 15 Arg Arg Glu LeuGly Gln Asn Phe Leu Gln Asp Asp Arg Ala Val Arg 20 25 30 Asn Leu Val ThrHis Val Glu Gly Asp Gly Arg Asn Val Leu Glu Ile 35 40 45 Gly Pro Gly LysGly Ala Ile Thr Glu Glu Leu Val Arg Ser Phe Asp 50 55 60 Thr Val Thr ValVal Glu Met Asp Pro His Trp Ala Ala His Val Arg 65 70 75 80 Arg Lys PheGlu Gly Glu Arg Val Thr Val Phe Gln Gly Asp Phe Leu 85 90 95 Asp Phe ArgIle Pro Arg Asp Ile Asp Thr Val Val Gly Asn Val Pro 100 105 110 Phe GlyIle Thr Thr Gln Ile Leu Arg Ser Leu Leu Glu Ser Thr Asn 115 120 125 TrpGln Ser Ala Ala Leu Ile Val Gln Trp Glu Val Ala Arg Lys Arg 130 135 140Ala Gly Arg Ser Gly Gly Ser Leu Leu Thr Thr Ser Trp Ala Pro Trp 145 150155 160 Tyr Glu Phe Ala Val His Asp Arg Val Arg Ala Ser Ser Phe Arg Pro165 170 175 Met Pro Arg Val Asp Gly Gly Val Leu Thr Ile Arg Arg Arg ProGln 180 185 190 Pro Leu Leu Pro Glu Ser Ala Ser Arg Ala Phe Gln Asn PheAla Glu 195 200 205 Ala Val Phe Thr Gly Pro Gly Arg Gly Leu Ala Glu IleLeu Arg Arg 210 215 220 His Ile Pro Lys Arg Thr Tyr Arg Ser Leu Ala AspArg His Gly Ile 225 230 235 240 Pro Asp Gly Gly Leu Pro Lys Asp Leu ThrLeu Thr Gln Trp Ile Ala 245 250 255 Leu Phe Gln Ala Ser Gln Pro Ser TyrAla Pro Gly Ala Pro Gly Thr 260 265 270 Arg Met Pro Gly Gln Gly Gly GlyAla Gly Gly Arg Asp Tyr Asp Ser 275 280 285 Glu Thr Ser Arg Ala Ala ValPro Gly Ser Arg Arg Tyr Gly Pro Thr 290 295 300 Arg Gly Gly Glu Pro CysAla Pro Arg Ala Gln Val Arg Gln Thr Lys 305 310 315 320 Gly Arg Gln GlyAla Arg Gly Ser Ser Tyr Gly Arg Arg Thr Gly Arg 325 330 335 Met Ser SerAla Gly Ile Thr Arg Thr Gly Ala Arg Thr Pro Val Thr 340 345 350 Gly 
ArgGly Ala Ala Ala Trp Asp Thr Gly Glu Val Arg Val Arg Arg 355 360 365 GlyLeu Pro Pro Ala Gly Pro Asp His Ala Glu His Ser Phe Ser Arg 370 375 380Ala Pro Thr Gly Asp Val Arg Ala Glu Leu Ile Arg Gly Glu Met Ser 385 390395 400 Thr Val Ser Lys Ser Glu Ser Glu Glu Phe Val Ser Val Ser Asn Asp405 410 415 Ala Gly Ser Ala His Gly Thr Ala Glu Pro Val Ala Val Val GlyIle 420 425 430 Ser Cys Arg Val Pro Gly Ala Arg Asp Pro Arg Glu Phe TrpGlu Leu 435 440 445 Leu Ala Ala Gly Gly Gln Ala Val Thr Asp Val Pro AlaAsp Arg Trp 450 455 460 Asn Ala Gly Asp Phe Tyr Asp Pro Asp Arg Ser AlaPro Gly Arg Ser 465 470 475 480 Asn Ser Arg Trp Gly Gly Phe Ile Glu AspVal Asp Arg Phe Asp Ala 485 490 495 Ala Phe Phe Gly Ile Ser Pro Arg GluAla Ala Glu Met Asp Pro Gln 500 505 510 Gln Arg Leu Ala Leu Glu Leu GlyTrp Glu Ala Leu Glu Arg Ala Gly 515 520 525 Ile Asp Pro Ser Ser Leu ThrGly Thr Arg Thr Gly Val Phe Ala Gly 530 535 540 Ala Ile Trp Asp Asp TyrAla Thr Leu Lys His Arg Gln Gly Gly Ala 545 550 555 560 Ala Ile Thr ProHis Thr Val Thr Gly Leu His Arg Gly Ile Ile Ala 565 570 575 Asn Arg LeuSer Tyr Thr Leu Gly Leu Arg Gly Pro Ser Met Val Val 580 585 590 Asp SerGly Gln Ser Ser Ser Leu Val Ala Val His Leu Ala Cys Glu 595 600 605 SerLeu Arg Arg Gly Glu Ser Glu Leu Ala Leu Ala Gly Gly Val Ser 610 615 620Leu Asn Leu Val Pro Asp Ser Ile Ile Gly Ala Ser Lys Phe Gly Gly 625 630635 640 Leu Ser Pro Asp Gly Arg Ala Tyr Thr Phe Asp Ala Arg Ala Asn Gly645 650 655 Tyr Val Arg Gly Glu Gly Gly Gly Phe Val Val Leu Lys Arg LeuSer 660 665 670 Arg Ala Val Ala Asp Gly Asp Pro Val Leu Ala Val Ile ArgGly Ser 675 680 685 Ala Val Asn Asn Gly Gly Ala Ala Gln Gly Met Thr ThrPro Asp Ala 690 695 700 Gln Ala Gln Glu Ala Val Leu Arg Glu Ala His GluArg Ala Gly Thr 705 710 715 720 Ala Pro Ala Asp Val Arg Tyr Val Glu LeuHis Gly Thr Gly Thr Pro 725 730 735 Val Gly Asp Pro Ile Glu Ala Ala AlaLeu Gly Ala Ala Leu Gly Thr 740 745 750 Gly Arg Pro Ala Gly Gln Pro LeuLeu Val Gly Ser Val Lys Thr Asn 755 760 765 Ile Gly His Leu Glu Gly AlaAla Gly 
Ile Ala Gly Leu Ile Lys Ala 770 775 780 Val Leu Ala Val Arg GlyArg Ala Leu Pro Ala Ser Leu Asn Tyr Glu 785 790 795 800 Thr Pro Asn ProAla Ile Pro Phe Glu Glu Leu Asn Leu Arg Val Asn 805 810 815 Thr Glu TyrLeu Pro Trp Glu Pro Glu His Asp Gly Gln Arg Met Val 820 825 830 Val GlyVal Ser Ser Phe Gly Met Gly Gly Thr Asn Ala His Val Val 835 840 845 LeuGlu Glu Ala Pro Gly Gly Cys Arg Gly Ala Ser Val Val Glu Ser 850 855 860Thr Val Gly Gly Ser Ala Val Gly Gly Gly Val Val Pro Trp Val Val 865 870875 880 Ser Ala Lys Ser Ala Ala Ala Leu Asp Ala Gln Ile Glu Arg Leu Ala885 890 895 Ala Phe Ala Ser Arg Asp Arg Thr Asp Gly Val Asp Ala Gly AlaVal 900 905 910 Asp Ala Gly Ala Val Asp Ala Gly Ala Val Ala Arg Val LeuAla Gly 915 920 925 Gly Arg Ala Gln Phe Glu His Arg Ala Val Val Val GlySer Gly Pro 930 935 940 Asp Asp Leu Ala Ala Ala Leu Ala Ala Pro Glu GlyLeu Val Arg Gly 945 950 955 960 Val Ala Ser Gly Val Gly Arg Val Ala PheVal Phe Pro Gly Gln Gly 965 970 975 Thr Gln Trp Ala Gly Met Gly Ala GluLeu Leu Asp Ser Ser Ala Val 980 985 990 Phe Ala Ala Ala Met Ala Glu CysGlu Ala Ala Leu Ser Pro Tyr Val 995 1000 1005 Asp Trp Ser Leu Glu AlaVal Val Arg Gln Ala Pro Gly Ala Pro Thr 1010 1015 1020 Leu Glu Arg ValAsp Val Val Gln Pro Val Thr Phe Ala Val Met Val 1025 1030 1035 1040 SerLeu Ala Arg Val Trp Gln His His Gly Val Thr Pro Gln Ala Val 1045 10501055 Val Gly His Ser Gln Gly Glu Ile Ala Ala Ala Tyr Val Ala Gly Ala1060 1065 1070 Leu Ser Leu Asp Asp Ala Ala Arg Val Val Thr Leu Arg SerLys Ser 1075 1080 1085 Ile Ala Ala His Leu Ala Gly Lys Gly Gly Met LeuSer Leu Ala Leu 1090 1095 1100 Ser Glu Asp Ala Val Leu Glu Arg Leu AlaGly Phe Asp Gly Leu Ser 1105 1110 1115 1120 Val Ala Ala Val Asn Gly ProThr Ala Thr Val Val Ser Gly Asp Pro 1125 1130 1135 Val Gln Ile Glu GluLeu Ala Arg Ala Cys Glu Ala Asp Gly Val Arg 1140 1145 1150 Ala Arg ValIle Pro Val Asp Tyr Ala Ser His Ser Arg Gln Val Glu 1155 1160 1165 IleIle Glu Ser Glu Leu Ala Glu Val Leu Ala Gly Leu Ser Pro Gln 1170 11751180 Ala Pro Arg Val Pro Phe Phe 
Ser Thr Leu Glu Gly Ala Trp Ile Thr1185 1190 1195 1200 Glu Pro Val Leu Asp Gly Gly Tyr Trp Tyr Arg Asn LeuArg His Arg 1205 1210 1215 Val Gly Phe Ala Pro Ala Val Glu Thr Leu AlaThr Asp Glu Gly Phe 1220 1225 1230 Thr His Phe Val Glu Val Ser Ala HisPro Val Leu Thr Met Ala Leu 1235 1240 1245 Pro Gly Thr Val Thr Gly LeuAla Thr Leu Arg Arg Asp Asn Gly Gly 1250 1255 1260 Gln Asp Arg Leu ValAla Ser Leu Ala Glu Ala Trp Ala Asn Gly Leu 1265 1270 1275 1280 Ala ValAsp Trp Ser Pro Leu Leu Pro Ser Ala Thr Gly His His Ser 1285 1290 1295Asp Leu Pro Thr Tyr Ala Phe Gln Thr Glu Arg His Trp Leu Gly Glu 13001305 1310 Ile Glu Ala Leu Ala Pro Ala Gly Glu Pro Ala Val Gln Pro AlaVal 1315 1320 1325 Leu Arg Thr Glu Ala Ala Glu Pro Ala Glu Leu Asp ArgAsp Glu Gln 1330 1335 1340 Leu Arg Val Ile Leu Asp Lys Val Arg Ala GlnThr Ala Gln Val Leu 1345 1350 1355 1360 Gly Tyr Ala Thr Gly Gly Gln IleGlu Val Asp Arg Thr Phe Arg Glu 1365 1370 1375 Ala Gly Cys Thr Ser LeuThr Gly Val Asp Leu Arg Asn Arg Ile Asn 1380 1385 1390 Ala Ala Phe GlyVal Arg Met Ala Pro Ser Met Ile Phe Asp Phe Pro 1395 1400 1405 Thr ProGlu Ala Leu Ala Glu Gln Leu Leu Leu Val Val His Gly Glu 1410 1415 1420Ala Ala Ala Asn Pro Ala Gly Ala Glu Pro Ala Pro Val Ala Ala Ala 14251430 1435 1440 Gly Ala Val Asp Glu Pro Val Ala Ile Val Gly Met Ala CysArg Leu 1445 1450 1455 Pro Gly Gly Val Ala Ser Pro Glu Asp Leu Trp ArgLeu Val Ala Gly 1460 1465 1470 Gly Gly Asp Ala Ile Ser Glu Phe Pro GlnAsp Arg Gly Trp Asp Val 1475 1480 1485 Glu Gly Leu Tyr His Pro Asp ProGlu His Pro Gly Thr Ser Tyr Val 1490 1495 1500 Arg Gln Gly Gly Phe IleGlu Asn Val Ala Gly Phe Asp Ala Ala Phe 1505 1510 1515 1520 Phe Gly IleSer Pro Arg Glu Ala Leu Ala Met Asp Pro Gln Gln Arg 1525 1530 1535 LeuLeu Leu Glu Thr Ser Trp Glu Ala Val Glu Asp Ala Gly Ile Asp 1540 15451550 Pro Thr Ser Leu Arg Gly Arg Gln Val Gly Val Phe Thr Gly Ala Met1555 1560 1565 Thr His Glu Tyr Gly Pro Ser Leu Arg Asp Gly Gly Glu GlyLeu Asp 1570 1575 1580 Gly Tyr Leu Leu Thr Gly Asn Thr Ala Ser Val MetSer Gly 
Arg Val 1585 1590 1595 1600 Ser Tyr Thr Leu Gly Leu Glu Gly ProAla Leu Thr Val Asp Thr Ala 1605 1610 1615 Cys Ser Ser Ser Leu Val AlaLeu His Leu Ala Val Gln Ala Leu Arg 1620 1625 1630 Lys Gly Glu Val AspMet Ala Leu Ala Gly Gly Val Ala Val Met Pro 1635 1640 1645 Thr Pro GlyMet Phe Val Glu Phe Ser Arg Gln Arg Gly Leu Ala Gly 1650 1655 1660 AspGly Arg Ser Lys Ala Phe Ala Ala Ser Ala Asp Gly Thr Ser Trp 1665 16701675 1680 Ser Glu Gly Val Gly Val Leu Leu Val Glu Arg Leu Ser Asp AlaArg 1685 1690 1695 Arg Asn Gly His Gln Val Leu Ala Val Val Arg Gly SerAla Leu Asn 1700 1705 1710 Gln Asp Gly Ala Ser Asn Gly Leu Thr Ala ProAsn Gly Pro Ser Gln 1715 1720 1725 Gln Arg Val Ile Arg Arg Ala Leu AlaAsp Ala Arg Leu Thr Thr Ser 1730 1735 1740 Asp Val Asp Val Val Glu AlaHis Gly Thr Gly Thr Arg Leu Gly Asp 1745 1750 1755 1760 Pro Ile Glu AlaGln Ala Leu Ile Ala Thr Tyr Gly Gln Gly Arg Asp 1765 1770 1775 Asp GluGln Pro Leu Arg Leu Gly Ser Leu Lys Ser Asn Ile Gly His 1780 1785 1790Thr Gln Ala Ala Ala Gly Val Ser Gly Val Ile Lys Met Val Gln Ala 17951800 1805 Met Arg His Gly Leu Leu Pro Lys Thr Leu His Val Asp Glu ProSer 1810 1815 1820 Asp Gln Ile Asp Trp Ser Ala Gly Ala Val Glu Leu LeuThr Glu Ala 1825 1830 1835 1840 Val Asp Trp Pro Glu Lys Gln Asp Gly GlyLeu Arg Arg Ala Ala Val 1845 1850 1855 Ser Ser Phe Gly Ile Ser Gly ThrAsn Ala His Val Val Leu Glu Glu 1860 1865 1870 Ala Pro Val Val Val GluGly Ala Ser Val Val Glu Pro Ser Val Gly 1875 1880 1885 Gly Ser Ala ValGly Gly Gly Val Thr Pro Trp Val Val Ser Ala Lys 1890 1895 1900 Ser AlaAla Ala Leu Asp Ala Gln Ile Glu Arg Leu Ala Ala Phe Ala 1905 1910 19151920 Ser Arg Asp Arg Thr Asp Asp Ala Asp Ala Gly Ala Val Asp Ala Gly1925 1930 1935 Ala Val Ala His Val Leu Ala Asp Gly Arg Ala Gln Phe GluHis Arg 1940 1945 1950 Ala Val Ala Leu Gly Ala Gly Ala Asp Asp Leu ValGln Ala Leu Ala 1955 1960 1965 Asp Pro Asp Gly Leu Ile Arg Gly Thr AlaSer Gly Val Gly Arg Val 1970 1975 1980 Ala Phe Val Phe Pro Gly Gln GlyThr Gln Trp Ala Gly Met Gly Ala 1985 1990 1995 2000 
Glu Leu Leu Asp SerSer Ala Val Phe Ala Ala Ala Met Ala Glu Cys 2005 2010 2015 Glu Ala AlaLeu Ser Pro Tyr Val Asp Trp Ser Leu Glu Ala Val Val 2020 2025 2030 ArgGln Ala Pro Gly Ala Pro Thr Leu Glu Arg Val Asp Val Val Gln 2035 20402045 Pro Val Thr Phe Ala Val Met Val Ser Leu Ala Arg Val Trp Gln His2050 2055 2060 His Gly Val Thr Pro Gln Ala Val Val Gly His Ser Gln GlyGlu Ile 2065 2070 2075 2080 Ala Ala Ala Tyr Val Ala Gly Ala Leu Pro LeuAsp Asp Ala Ala Arg 2085 2090 2095 Val Val Thr Leu Arg Ser Lys Ser IleAla Ala His Leu Ala Gly Lys 2100 2105 2110 Gly Gly Met Leu Ser Leu AlaLeu Asn Glu Asp Ala Val Leu Glu Arg 2115 2120 2125 Leu Ser Asp Phe AspGly Leu Ser Val Ala Ala Val Asn Gly Pro Thr 2130 2135 2140 Ala Thr ValVal Ser Gly Asp Pro Val Gln Ile Glu Glu Leu Ala Gln 2145 2150 2155 2160Ala Cys Lys Ala Asp Gly Phe Arg Ala Arg Ile Ile Pro Val Asp Tyr 21652170 2175 Ala Ser His Ser Arg Gln Val Glu Ile Ile Glu Ser Glu Leu AlaGln 2180 2185 2190 Val Leu Ala Gly Leu Ser Pro Gln Ala Pro Arg Val ProPhe Phe Ser 2195 2200 2205 Thr Leu Glu Gly Thr Trp Ile Thr Glu Pro ValLeu Asp Gly Thr Tyr 2210 2215 2220 Trp Tyr Arg Asn Leu Arg His Arg ValGly Phe Ala Pro Ala Ile Glu 2225 2230 2235 2240 Thr Leu Ala Val Asp GluGly Phe Thr His Phe Val Glu Val Ser Ala 2245 2250 2255 His Pro Val LeuThr Met Thr Leu Pro Glu Thr Val Thr Gly Leu Gly 2260 2265 2270 Thr LeuArg Arg Glu Gln Gly Gly Gln Glu Arg Leu Val Thr Ser Leu 2275 2280 2285Ala Glu Ala Trp Val Asn Gly Leu Pro Val Ala Trp Thr Ser Leu Leu 22902295 2300 Pro Ala Thr Ala Ser Arg Pro Gly Leu Pro Thr Tyr Ala Phe GlnAla 2305 2310 2315 2320 Glu Arg Tyr Trp Leu Glu Asn Thr Pro Ala Ala LeuAla Thr Gly Asp 2325 2330 2335 Asp Trp Arg Tyr Arg Ile Asp Trp Lys ArgLeu Pro Ala Ala Glu Gly 2340 2345 2350 Ser Glu Arg Thr Gly Leu Ser GlyArg Trp Leu Ala Val Thr Pro Glu 2355 2360 2365 Asp His Ser Ala Gln AlaAla Ala Val Leu Thr Ala Leu Val Asp Ala 2370 2375 2380 Gly Ala Lys ValGlu Val Leu Thr Ala Gly Ala Asp Asp Asp Arg Glu 2385 2390 2395 2400 AlaLeu Ala Ala Arg Leu Thr 
Ala Leu Thr Thr Gly Asp Gly Phe Thr 2405 24102415 Gly Val Val Ser Leu Leu Asp Gly Leu Val Pro Gln Val Ala Trp Val2420 2425 2430 Gln Ala Leu Gly Asp Ala Gly Ile Lys Ala Pro Leu Trp SerVal Thr 2435 2440 2445 Gln Gly Ala Val Ser Val Gly Arg Leu Asp Thr ProAla Asp Pro Asp 2450 2455 2460 Arg Ala Met Leu Trp Gly Leu Gly Arg ValVal Ala Leu Glu His Pro 2465 2470 2475 2480 Glu Arg Trp Ala Gly Leu ValAsp Leu Pro Ala Gln Pro Asp Ala Ala 2485 2490 2495 Ala Leu Ala His LeuVal Thr Ala Leu Ser Gly Ala Thr Gly Glu Asp 2500 2505 2510 Gln Ile AlaIle Arg Thr Thr Gly Leu His Ala Arg Arg Leu Ala Arg 2515 2520 2525 AlaPro Leu His Gly Arg Arg Pro Thr Arg Asp Trp Gln Pro His Gly 2530 25352540 Thr Val Leu Ile Thr Gly Gly Thr Gly Ala Leu Gly Ser His Ala Ala2545 2550 2555 2560 Arg Trp Met Ala His His Gly Ala Glu His Leu Leu LeuVal Ser Arg 2565 2570 2575 Ser Gly Glu Gln Ala Pro Gly Ala Thr Gln LeuThr Ala Glu Leu Thr 2580 2585 2590 Ala Ser Gly Ala Arg Val Thr Ile AlaAla Cys Asp Val Ala Asp Pro 2595 2600 2605 His Ala Met Arg Thr Leu LeuAsp Ala Ile Pro Ala Glu Thr Pro Leu 2610 2615 2620 Thr Ala Val Val HisThr Ala Gly Ala Leu Asp Asp Gly Ile Val Asp 2625 2630 2635 2640 Thr LeuThr Ala Glu Gln Val Arg Arg Ala His Arg Ala Lys Ala Val 2645 2650 2655Gly Ala Ser Val Leu Asp Glu Leu Thr Arg Asp Leu Asp Leu Asp Ala 26602665 2670 Phe Val Leu Phe Ser Ser Val Ser Ser Thr Leu Gly Ile Pro GlyGln 2675 2680 2685 Gly Asn Tyr Ala Pro His Asn Ala Tyr Leu Asp Ala LeuAla Ala Arg 2690 2695 2700 Arg Arg Ala Thr Gly Arg Ser Ala Val Ser ValAla Trp Gly Pro Trp 2705 2710 2715 2720 Asp Gly Gly Gly Met Ala Ala GlyAsp Gly Val Ala Glu Arg Leu Arg 2725 2730 2735 Asn His Gly Val Pro GlyMet Asp Pro Glu Leu Ala Leu Ala Ala Leu 2740 2745 2750 Glu Ser Ala LeuGly Arg Asp Glu Thr Ala Ile Thr Val Ala Asp Ile 2755 2760 2765 Asp TrpAsp Arg Phe Tyr Leu Ala Tyr Ser Ser Gly Arg Pro Gln Pro 2770 2775 2780Leu Val Glu Glu Leu Pro Glu Val Arg Arg Ile Ile Asp Ala Arg Asp 27852790 2795 2800 Ser Ala Thr Ser Gly Gln Gly Gly Ser Ser Ala Gln Gly 
AlaAsn Pro 2805 2810 2815 Leu Ala Glu Arg Leu Ala Ala Ala Ala Pro Gly GluArg Thr Glu Ile 2820 2825 2830 Leu Leu Gly Leu Val Arg Ala Gln Ala AlaAla Val Leu Arg Met Arg 2835 2840 2845 Ser Pro Glu Asp Val Ala Ala AspArg Ala Phe Lys Asp Ile Gly Phe 2850 2855 2860 Asp Ser Leu Ala Gly ValGlu Leu Arg Asn Arg Leu Thr Arg Ala Thr 2865 2870 2875 2880 Gly Leu GlnLeu Pro Ala Thr Leu Val Phe Asp His Pro Thr Pro Leu 2885 2890 2895 AlaLeu Val Ser Leu Leu Arg Ser Glu Phe Leu Gly Asp Glu Glu Thr 2900 29052910 Ala Asp Ala Arg Arg Ser Ala Ala Leu Pro Ala Thr Val Gly Ala Gly2915 2920 2925 Ala Gly Ala Gly Ala Gly Thr Asp Ala Asp Asp Asp Pro IleAla Ile 2930 2935 2940 Val Ala Met Ser Cys Arg Tyr Pro Gly Asp Ile ArgSer Pro Glu Asp 2945 2950 2955 2960 Leu Trp Arg Met Leu Ser Glu Gly GlyGlu Gly Ile Thr Pro Phe Pro 2965 2970 2975 Thr Asp Arg Gly Trp Asp LeuAsp Gly Leu Tyr Asp Ala Asp Pro Asp 2980 2985 2990 Ala Leu Gly Arg AlaTyr Val Arg Glu Gly Gly Phe Leu His Asp Ala 2995 3000 3005 Ala Glu PheAsp Ala Glu Phe Phe Gly Val Ser Pro Arg Glu Ala Leu 3010 3015 3020 AlaMet Asp Pro Gln Gln Arg Met Leu Leu Thr Thr Ser Trp Glu Ala 3025 30303035 3040 Phe Glu Arg Ala Gly Ile Glu Pro Ala Ser Leu Arg Gly Ser SerThr 3045 3050 3055 Gly Val Phe Ile Gly Leu Ser Tyr Gln Asp Tyr Ala AlaArg Val Pro 3060 3065 3070 Asn Ala Pro Arg Gly Val Glu Gly Tyr Leu LeuThr Gly Ser Thr Pro 3075 3080 3085 Ser Val Ala Ser Gly Arg Ile Ala TyrThr Phe Gly Leu Glu Gly Pro 3090 3095 3100 Ala Thr Thr Val Asp Thr AlaCys Ser Ser Ser Leu Thr Ala Leu His 3105 3110 3115 3120 Leu Ala Val ArgAla Leu Arg Ser Gly Glu Cys Thr Met Ala Leu Ala 3125 3130 3135 Gly GlyVal Ala Met Met Ala Thr Pro His Met Phe Val Glu Phe Ser 3140 3145 3150Arg Gln Arg Ala Leu Ala Pro Asp Gly Arg Ser Lys Ala Phe Ser Ala 31553160 3165 Asp Ala Asp Gly Phe Gly Ala Ala Glu Gly Val Gly Leu Leu LeuVal 3170 3175 3180 Glu Arg Leu Ser Asp Ala Arg Arg Asn Gly His Pro ValLeu Ala Val 3185 3190 3195 3200 Val Arg Gly Thr Ala Val Asn Gln Asp GlyAla Ser Asn Gly Leu Thr 3205 3210 3215 
Ala Pro Asn Gly Pro Ser Gln GlnArg Val Ile Arg Gln Ala Leu Ala 3220 3225 3230 Asp Ala Arg Leu Ala ProGly Asp Ile Asp Ala Val Glu Thr His Gly 3235 3240 3245 Thr Gly Thr SerLeu Gly Asp Pro Ile Glu Ala Gln Gly Leu Gln Ala 3250 3255 3260 Thr TyrGly Lys Glu Arg Pro Ala Glu Arg Pro Leu Ala Ile Gly Ser 3265 3270 32753280 Val Lys Ser Asn Ile Gly His Thr Gln Ala Ala Ala Gly Ala Ala Gly3285 3290 3295 Ile Ile Lys Met Val Leu Ala Met Arg His Gly Thr Leu ProLys Thr 3300 3305 3310 Leu His Ala Asp Glu Pro Ser Pro His Val Asp TrpAla Asn Ser Gly 3315 3320 3325 Leu Ala Leu Val Thr Glu Pro Ile Asp TrpPro Ala Gly Thr Gly Pro 3330 3335 3340 Arg Arg Ala Ala Val Ser Ser PheGly Ile Ser Gly Thr Asn Ala His 3345 3350 3355 3360 Val Val Leu Glu GlnAla Pro Asp Ala Ala Gly Glu Val Leu Gly Ala 3365 3370 3375 Asp Glu ValPro Glu Val Ser Glu Thr Val Ala Met Ala Gly Thr Ala 3380 3385 3390 GlyThr Ser Glu Val Ala Glu Gly Ser Glu Ala Ser Glu Ala Pro Ala 3395 34003405 Ala Pro Gly Ser Arg Glu Ala Ser Leu Pro Gly His Leu Pro Trp Val3410 3415 3420 Leu Ser Ala Lys Asp Glu Gln Ser Leu Arg Gly Gln Ala AlaAla Leu 3425 3430 3435 3440 His Ala Trp Leu Ser Glu Pro Ala Ala Asp LeuSer Asp Ala Asp Gly 3445 3450 3455 Pro Ala Arg Leu Arg Asp Val Gly TyrThr Leu Ala Thr Ser Arg Thr 3460 3465 3470 Ala Phe Ala His Arg Ala AlaVal Thr Ala Ala Asp Arg Asp Gly Phe 3475 3480 3485 Leu Asp Gly Leu AlaThr Leu Ala Gln Gly Gly Thr Ser Ala His Val 3490 3495 3500 His Leu AspThr Ala Arg Asp Gly Thr Thr Ala Phe Leu Phe Thr Gly 3505 3510 3515 3520Gln Gly Ser Gln Arg Pro Gly Ala Gly Arg Glu Leu Tyr Asp Arg His 35253530 3535 Pro Val Phe Ala Arg Ala Leu Asp Glu Ile Cys Ala His Leu AspGly 3540 3545 3550 His Leu Glu Leu Pro Leu Leu Asp Val Met Phe Ala AlaGlu Gly Ser 3555 3560 3565 Ala Glu Ala Ala Leu Leu Asp Glu Thr Arg TyrThr Gln Cys Ala Leu 3570 3575 3580 Phe Ala Leu Glu Val Ala Leu Phe ArgLeu Val Glu Ser Trp Gly Met 3585 3590 3595 3600 Arg Pro Ala Ala Leu LeuGly His Ser Val Gly Glu Ile Ala Ala Ala 3605 3610 3615 His Val Ala GlyVal Phe Ser 
Leu Ala Asp Ala Ala Arg Leu Val Ala 3620 3625 3630 Ala ArgGly Arg Leu Met Gln Glu Leu Pro Ala Gly Gly Ala Met Leu 3635 3640 3645Ala Val Gln Ala Ala Glu Asp Glu Ile Arg Val Trp Leu Glu Thr Glu 36503655 3660 Glu Arg Tyr Ala Gly Arg Leu Asp Val Ala Ala Val Asn Gly ProGlu 3665 3670 3675 3680 Ala Ala Val Leu Ser Gly Asp Ala Asp Ala Ala ArgGlu Ala Glu Ala 3685 3690 3695 Tyr Trp Ser Gly Leu Gly Arg Arg Thr ArgAla Leu Arg Val Ser His 3700 3705 3710 Ala Phe His Ser Ala His Met AspGly Met Leu Asp Gly Phe Arg Ala 3715 3720 3725 Val Leu Glu Thr Val GluPhe Arg Arg Pro Ser Leu Thr Val Val Ser 3730 3735 3740 Asn Val Thr GlyLeu Ala Ala Gly Pro Asp Asp Leu Cys Asp Pro Glu 3745 3750 3755 3760 TyrTrp Val Arg His Val Arg Gly Thr Val Arg Phe Leu Asp Gly Val 3765 37703775 Arg Val Leu Arg Asp Leu Gly Val Arg Thr Cys Leu Glu Leu Gly Pro3780 3785 3790 Asp Gly Val Leu Thr Ala Met Ala Ala Asp Gly Leu Ala AspThr Pro 3795 3800 3805 Ala Asp Ser Ala Ala Gly Ser Pro Val Gly Ser ProAla Gly Ser Pro 3810 3815 3820 Ala Asp Ser Ala Ala Gly Ala Leu Arg ProArg Pro Leu Leu Val Ala 3825 3830 3835 3840 Leu Leu Arg Arg Lys Arg SerGlu Thr Glu Thr Val Ala Asp Ala Leu 3845 3850 3855 Gly Arg Ala His AlaHis Gly Thr Gly Pro Asp Trp His Ala Trp Phe 3860 3865 3870 Ala Gly SerGly Ala His Arg Val Asp Leu Pro Thr Tyr Ser Phe Arg 3875 3880 3885 ArgAsp Arg Tyr Trp Leu Asp Ala Pro Ala Ala Asp Thr Ala Val Asp 3890 38953900 Thr Ala Gly Leu Gly Leu Gly Thr Ala Asp His Pro Leu Leu Gly Ala3905 3910 3915 3920 Val Val Ser Leu Pro Asp Arg Asp Gly Leu Leu Leu ThrGly Arg Leu 3925 3930 3935 Ser Leu Arg Thr His Pro Trp Leu Ala Asp HisAla Val Leu Gly Ser 3940 3945 3950 Val Leu Leu Pro Gly Ala Ala Met ValGlu Leu Ala Ala His Ala Ala 3955 3960 3965 Glu Ser Ala Gly Leu Arg AspVal Arg Glu Leu Thr Leu Leu Glu Pro 3970 3975 3980 Leu Val Leu Pro GluHis Gly Gly Val Glu Leu Arg Val Thr Val Gly 3985 3990 3995 4000 Ala ProAla Gly Glu Pro Gly Gly Glu Ser Ala Gly Asp Gly Ala Arg 4005 4010 4015Pro Val Ser Leu His Ser Arg Leu Ala Asp Ala Pro Ala Gly 
Thr Ala 40204025 4030 Trp Ser Cys His Ala Thr Gly Leu Leu Ala Thr Asp Arg Pro GluLeu 4035 4040 4045 Pro Val Ala Pro Asp Arg Ala Ala Met Trp Pro Pro GlnGly Ala Glu 4050 4055 4060 Glu Val Pro Leu Asp Gly Leu Tyr Glu Arg LeuAsp Gly Asn Gly Leu 4065 4070 4075 4080 Ala Phe Gly Pro Leu Phe Gln GlyLeu Asn Ala Val Trp Arg Tyr Glu 4085 4090 4095 Gly Glu Val Phe Ala AspIle Ala Leu Pro Ala Thr Thr Asn Ala Thr 4100 4105 4110 Ala Pro Ala ThrAla Asn Gly Gly Gly Ser Ala Ala Ala Ala Pro Tyr 4115 4120 4125 Gly IleHis Pro Ala Leu Leu Asp Ala Ser Leu His Ala Ile Ala Val 4130 4135 4140Gly Gly Leu Val Asp Glu Pro Glu Leu Val Arg Val Pro Phe His Trp 41454150 4155 4160 Ser Gly Val Thr Val His Ala Ala Gly Ala Ala Ala Ala ArgVal Arg 4165 4170 4175 Leu Ala Ser Ala Gly Thr Asp Ala Val Ser Leu SerLeu Thr Asp Gly 4180 4185 4190 Glu Gly Arg Pro Leu Val Ser Val Glu ArgLeu Thr Leu Arg Pro Val 4195 4200 4205 Thr Ala Asp Gln Ala Ala Ala SerArg Val Gly Gly Leu Met His Arg 4210 4215 4220 Val Ala Trp Arg Pro TyrAla Leu Ala Ser Ser Gly Glu Gln Asp Pro 4225 4230 4235 4240 His Ala ThrSer Tyr Gly Pro Thr Ala Val Leu Gly Lys Asp Glu Leu 4245 4250 4255 LysVal Ala Ala Ala Leu Glu Ser Ala Gly Val Glu Val Gly Leu Tyr 4260 42654270 Pro Asp Leu Ala Ala Leu Ser Gln Asp Val Ala Ala Gly Ala Pro Ala4275 4280 4285 Pro Arg Thr Val Leu Ala Pro Leu Pro Ala Gly Pro Ala AspGly Gly 4290 4295 4300 Ala Glu Gly Val Arg Gly Thr Val Ala Arg Thr LeuGlu Leu Leu Gln 4305 4310 4315 4320 Ala Trp Leu Ala Asp Glu His Leu AlaGly Thr Arg Leu Leu Leu Val 4325 4330 4335 Thr Arg Gly Ala Val Arg AspPro Glu Gly Ser Gly Ala Asp Asp Gly 4340 4345 4350 Gly Glu Asp Leu SerHis Ala Ala Ala Trp Gly Leu Val Arg Thr Ala 4355 4360 4365 Gln Thr GluAsn Pro Gly Arg Phe Gly Leu Leu Asp Leu Ala Asp Asp 4370 4375 4380 AlaSer Ser Tyr Arg Thr Leu Pro Ser Val Leu Ser Asp Ala Gly Leu 4385 43904395 4400 Arg Asp Glu Pro Gln Leu Ala Leu His Asp Gly Thr Ile Arg LeuAla 4405 4410 4415 Arg Leu Ala Ser Val Arg Pro Glu Thr Gly Thr Ala AlaPro Ala Leu 4420 4425 4430 Ala 
Pro Glu Gly Thr Val Leu Leu Thr Gly GlyThr Gly Gly Leu Gly 4435 4440 4445 Gly Leu Val Ala Arg His Val Val GlyGlu Trp Gly Val Arg Arg Leu 4450 4455 4460 Leu Leu Val Ser Arg Arg GlyThr Asp Ala Pro Gly Ala Asp Glu Leu 4465 4470 4475 4480 Val His Glu LeuGlu Ala Leu Gly Ala Asp Val Ser Val Ala Ala Cys 4485 4490 4495 Asp ValAla Asp Arg Glu Ala Leu Thr Ala Val Leu Asp Ala Ile Pro 4500 4505 4510Ala Glu His Pro Leu Thr Ala Val Val His Thr Ala Gly Val Leu Ser 45154520 4525 Asp Gly Thr Leu Pro Ser Met Thr Thr Glu Asp Val Glu His ValLeu 4530 4535 4540 Arg Pro Lys Val Asp Ala Ala Phe Leu Leu Asp Glu LeuThr Ser Thr 4545 4550 4555 4560 Pro Ala Tyr Asp Leu Ala Ala Phe Val MetPhe Ser Ser Ala Ala Ala 4565 4570 4575 Val Phe Gly Gly Ala Gly Gln GlyAla Tyr Ala Ala Ala Asn Ala Thr 4580 4585 4590 Leu Asp Ala Leu Ala TrpArg Arg Arg Ala Ala Gly Leu Pro Ala Leu 4595 4600 4605 Ser Leu Gly TrpGly Leu Trp Ala Glu Thr Ser Gly Met Thr Gly Glu 4610 4615 4620 Leu GlyGln Ala Asp Leu Arg Arg Met Ser Arg Ala Gly Ile Gly Gly 4625 4630 46354640 Ile Ser Asp Ala Glu Gly Ile Ala Leu Leu Asp Ala Ala Leu Arg Asp4645 4650 4655 Asp Arg His Pro Val Leu Leu Pro Leu Arg Leu Asp Ala AlaGly Leu 4660 4665 4670 Arg Asp Ala Ala Gly Asn Asp Pro Ala Gly Ile ProAla Leu Phe Arg 4675 4680 4685 Asp Val Val Gly Ala Arg Thr Val Arg AlaArg Pro Ser Ala Ala Ser 4690 4695 4700 Ala Ser Thr Thr Ala Gly Thr AlaGly Thr Pro Gly Thr Ala Asp Gly 4705 4710 4715 4720 Ala Ala Glu Thr AlaAla Val Thr Leu Ala Asp Arg Ala Ala Thr Val 4725 4730 4735 Asp Gly ProAla Arg Gln Arg Leu Leu Leu Glu Phe Val Val Gly Glu 4740 4745 4750 ValAla Glu Val Leu Gly His Ala Arg Gly His Arg Ile Asp Ala Glu 4755 47604765 Arg Gly Phe Leu Asp Leu Gly Phe Asp Ser Leu Thr Ala Val Glu Leu4770 4775 4780 Arg Asn Arg Leu Asn Ser Ala Gly Gly Leu Ala Leu Pro AlaThr Leu 4785 4790 4795 4800 Val Phe Asp His Pro Ser Pro Ala Ala Leu AlaSer His Leu Asp Ala 4805 4810 4815 Glu Leu Pro Arg Gly Ala Ser Asp GlnAsp Gly Ala Gly Asn Arg Asn 4820 4825 4830 Gly Asn Glu Asn Gly Thr ThrAla 
Ser Arg Ser Thr Ala Glu Thr Asp 4835 4840 4845 Ala Leu Leu Ala GlnLeu Thr Arg Leu Glu Gly Ala Leu Val Leu Thr 4850 4855 4860 Gly Leu SerAsp Ala Pro Gly Ser Glu Glu Val Leu Glu His Leu Arg 4865 4870 4875 4880Ser Leu Arg Ser Met Val Thr Gly Glu Thr Gly Thr Gly Thr Ala Ser 48854890 4895 Gly Ala Pro Asp Gly Ala Gly Ser Gly Ala Glu Asp Arg Pro TrpAla 4900 4905 4910 Ala Gly Asp Gly Ala Gly Gly Gly Ser Glu Asp Gly AlaGly Val Pro 4915 4920 4925 Asp Phe Met Asn Ala Ser Ala Glu Glu Leu PheGly Leu Leu Asp Gln 4930 4935 4940 Asp Pro Ser Thr Asp Met Ser Thr ValAsn Glu Glu Lys Tyr Leu Asp 4945 4950 4955 4960 Tyr Leu Arg Arg Ala ThrAla Asp Leu His Glu Ala Arg Gly Arg Leu 4965 4970 4975 Arg Glu Leu GluAla Lys Ala Gly Glu Pro Val Ala Ile Val Gly Met 4980 4985 4990 Ala CysArg Leu Pro Gly Gly Val Ala Ser Pro Glu Asp Leu Trp Arg 4995 5000 5005Leu Val Ala Gly Gly Glu Asp Ala Ile Ser Glu Phe Pro Gln Asp Arg 50105015 5020 Gly Trp Asp Val Glu Gly Leu Tyr Asp Pro Asn Pro Glu Ala ThrGly 5025 5030 5035 5040 Lys Ser Tyr Ala Arg Glu Ala Gly Phe Leu Tyr GluAla Gly Glu Phe 5045 5050 5055 Asp Ala Asp Phe Phe Gly Ile Ser Pro ArgGlu Ala Leu Ala Met Asp 5060 5065 5070 Pro Gln Gln Arg Leu Leu Leu GluAla Ser Trp Glu Ala Phe Glu His 5075 5080 5085 Ala Gly Ile Pro Ala AlaThr Ala Arg Gly Thr Ser Val Gly Val Phe 5090 5095 5100 Thr Gly Val MetTyr His Asp Tyr Ala Thr Arg Leu Thr Asp Val Pro 5105 5110 5115 5120 GluGly Ile Glu Gly Tyr Leu Gly Thr Gly Asn Ser Gly Ser Val Ala 5125 51305135 Ser Gly Arg Val Ala Tyr Thr Leu Gly Leu Glu Gly Pro Ala Val Thr5140 5145 5150 Val Asp Thr Ala Cys Ser Ser Ser Leu Val Ala Leu His LeuAla Val 5155 5160 5165 Gln Ala Leu Arg Lys Gly Glu Val Asp Met Ala LeuAla Gly Gly Val 5170 5175 5180 Thr Val Met Ser Thr Pro Ser Thr Phe ValGlu Phe Ser Arg Gln Arg 5185 5190 5195 5200 Gly Leu Ala Pro Asp Gly ArgSer Lys Ser Phe Ser Ser Thr Ala Asp 5205 5210 5215 Gly Thr Ser Trp SerGlu Gly Val Gly Val Leu Leu Val Glu Arg Leu 5220 5225 5230 Ser Asp AlaArg Arg Lys Gly His Arg Ile Leu Ala Val Val Arg 
Gly 5235 5240 5245 ThrAla Val Asn Gln Asp Gly Ala Ser Ser Gly Leu Thr Ala Pro Asn 5250 52555260 Gly Pro Ser Gln Gln Arg Val Ile Arg Arg Ala Leu Ala Asp Ala Arg5265 5270 5275 5280 Leu Thr Thr Ser Asp Val Asp Val Val Glu Ala His GlyThr Gly Thr 5285 5290 5295 Arg Leu Gly Asp Pro Ile Glu Ala Gln Ala ValIle Ala Thr Tyr Gly 5300 5305 5310 Gln Gly Arg Asp Gly Glu Gln Pro LeuArg Leu Gly Ser Leu Lys Ser 5315 5320 5325 Asn Ile Gly His Thr Gln AlaAla Ala Gly Val Ser Gly Val Ile Lys 5330 5335 5340 Met Val Gln Ala MetArg His Gly Val Leu Pro Lys Thr Leu His Val 5345 5350 5355 5360 Glu LysPro Thr Asp Gln Val Asp Trp Ser Ala Gly Ala Val Glu Leu 5365 5370 5375Leu Thr Glu Ala Met Asp Trp Pro Asp Lys Gly Asp Gly Gly Leu Arg 53805385 5390 Arg Ala Ala Val Ser Ser Phe Gly Val Ser Gly Thr Asn Ala HisVal 5395 5400 5405 Val Leu Glu Glu Ala Pro Ala Ala Glu Glu Thr Pro AlaSer Glu Ala 5410 5415 5420 Thr Pro Ala Val Glu Pro Ser Val Gly Ala GlyLeu Val Pro Trp Leu 5425 5430 5435 5440 Val Ser Ala Lys Thr Pro Ala AlaLeu Asp Ala Gln Ile Gly Arg Leu 5445 5450 5455 Ala Ala Phe Ala Ser GlnGly Arg Thr Asp Ala Ala Asp Pro Gly Ala 5460 5465 5470 Val Ala Arg ValLeu Ala Gly Gly Arg Ala Glu Phe Glu His Arg Ala 5475 5480 5485 Val ValLeu Gly Thr Gly Gln Asp Asp Phe Ala Gln Ala Leu Thr Ala 5490 5495 5500Pro Glu Gly Leu Ile Arg Gly Thr Pro Ser Asp Val Gly Arg Val Ala 55055510 5515 5520 Phe Val Phe Pro Gly Gln Gly Thr Gln Trp Ala Gly Met GlyAla Glu 5525 5530 5535 Leu Leu Asp Val Ser Lys Glu Phe Ala Ala Ala MetAla Glu Cys Glu 5540 5545 5550 Ser Ala Leu Ser Arg Tyr Val Asp Trp SerLeu Glu Ala Val Val Arg 5555 5560 5565 Gln Ala Pro Gly Ala Pro Thr LeuGlu Arg Val Asp Val Val Gln Pro 5570 5575 5580 Val Thr Phe Ala Val MetVal Ser Leu Ala Lys Val Trp Gln His His 5585 5590 5595 5600 Gly Val ThrPro Gln Ala Val Val Gly His Ser Gln Gly Glu Ile Ala 5605 5610 5615 AlaAla Tyr Val Ala Gly Ala Leu Thr Leu Asp Asp Ala Ala Arg Val 5620 56255630 Val Thr Leu Arg Ser Lys Ser Ile Ala Ala His Leu Ala Gly Lys Gly5635 5640 5645 Gly Met 
Ile Ser Leu Ala Leu Ser Glu Glu Ala Thr Arg GlnArg Ile 5650 5655 5660 Glu Asn Leu His Gly Leu Ser Ile Ala Ala Val AsnGly Pro Thr Ala 5665 5670 5675 5680 Thr Val Val Ser Gly Asp Pro Thr GlnIle Gln Glu Leu Ala Gln Ala 5685 5690 5695 Cys Glu Ala Asp Gly Val ArgAla Arg Ile Ile Pro Val Asp Tyr Ala 5700 5705 5710 Ser His Ser Ala HisVal Glu Thr Ile Glu Ser Glu Leu Ala Glu Val 5715 5720 5725 Leu Ala GlyLeu Ser Pro Arg Thr Pro Glu Val Pro Phe Phe Ser Thr 5730 5735 5740 LeuGlu Gly Ala Trp Ile Thr Glu Pro Val Leu Asp Gly Thr Tyr Trp 5745 57505755 5760 Tyr Arg Asn Leu Arg His Arg Val Gly Phe Ala Pro Ala Val GluThr 5765 5770 5775 Leu Ala Thr Asp Glu Gly Phe Thr His Phe Ile Glu ValSer Ala His 5780 5785 5790 Pro Val Leu Thr Met Thr Leu Pro Glu Thr ValThr Gly Leu Gly Thr 5795 5800 5805 Leu Arg Arg Glu Gln Gly Gly Gln GluArg Leu Val Thr Ser Leu Ala 5810 5815 5820 Glu Ala Trp Thr Asn Gly LeuThr Ile Asp Trp Ala Pro Val Leu Pro 5825 5830 5835 5840 Thr Ala Thr GlyHis His Pro Glu Leu Pro Thr Tyr Ala Phe Gln Arg 5845 5850 5855 Arg HisTyr Trp Leu His Asp Ser Pro Ala Val Gln Gly Ser Val Gln 5860 5865 5870Asp Ser Trp Arg Tyr Arg Ile Asp Trp Lys Arg Leu Ala Val Ala Asp 58755880 5885 Ala Ser Glu Arg Ala Gly Leu Ser Gly Arg Trp Leu Val Val ValPro 5890 5895 5900 Glu Asp Arg Ser Ala Glu Ala Ala Pro Val Leu Ala AlaLeu Ser Gly 5905 5910 5915 5920 Ala Gly Ala Asp Pro Val Gln Leu Asp ValSer Pro Leu Gly Asp Arg 5925 5930 5935 Gln Arg Leu Ala Ala Thr Leu GlyGlu Ala Leu Ala Ala Ala Gly Gly 5940 5945 5950 Ala Val Asp Gly Val LeuSer Leu Leu Ala Trp Asp Glu Ser Ala His 5955 5960 5965 Pro Gly His ProAla Pro Phe Thr Arg Gly Thr Gly Ala Thr Leu Thr 5970 5975 5980 Leu ValGln Ala Leu Glu Asp Ala Gly Val Ala Ala Pro Leu Trp Cys 5985 5990 59956000 Val Thr His Gly Ala Val Ser Val Gly Arg Ala Asp His Val Thr Ser6005 6010 6015 Pro Ala Gln Ala Met Val Trp Gly Met Gly Arg Val Ala AlaLeu Glu 6020 6025 6030 His Pro Glu Arg Trp Gly Gly Leu Ile Asp Leu ProSer Asp Ala Asp 6035 6040 6045 Arg Ala Ala Leu Asp Arg Met Thr Thr 
ValLeu Ala Gly Gly Thr Gly 6050 6055 6060 Glu Asp Gln Val Ala Val Arg AlaSer Gly Leu Leu Ala Arg Arg Leu 6065 6070 6075 6080 Val Arg Ala Ser LeuPro Ala His Gly Thr Ala Ser Pro Trp Trp Gln 6085 6090 6095 Ala Asp GlyThr Val Leu Val Thr Gly Ala Glu Glu Pro Ala Ala Ala 6100 6105 6110 GluAla Ala Arg Arg Leu Ala Arg Asp Gly Ala Gly His Leu Leu Leu 6115 61206125 His Thr Thr Pro Ser Gly Ser Glu Gly Ala Glu Gly Thr Ser Gly Ala6130 6135 6140 Ala Glu Asp Ser Gly Leu Ala Gly Leu Val Ala Glu Leu AlaAsp Leu 6145 6150 6155 6160 Gly Ala Thr Ala Thr Val Val Thr Cys Asp LeuThr Asp Ala Glu Ala 6165 6170 6175 Ala Ala Arg Leu Leu Ala Gly Val SerAsp Ala His Pro Leu Ser Ala 6180 6185 6190 Val Leu His Leu Pro Pro ThrVal Asp Ser Glu Pro Leu Ala Ala Thr 6195 6200 6205 Asp Ala Asp Ala LeuAla Arg Val Val Thr Ala Lys Ala Thr Ala Ala 6210 6215 6220 Leu His LeuAsp Arg Leu Leu Arg Glu Ala Ala Ala Ala Gly Gly Arg 6225 6230 6235 6240Pro Pro Val Leu Val Leu Phe Ser Ser Val Ala Ala Ile Trp Gly Gly 62456250 6255 Ala Gly Gln Gly Ala Tyr Ala Ala Gly Thr Ala Phe Leu Asp AlaLeu 6260 6265 6270 Ala Gly Gln His Arg Ala Asp Gly Pro Thr Val Thr SerVal Ala Trp 6275 6280 6285 Ser Pro Trp Glu Gly Ser Arg Val Thr Glu GlyAla Thr Gly Glu Arg 6290 6295 6300 Leu Arg Arg Leu Gly Leu Arg Pro LeuAla Pro Ala Thr Ala Leu Thr 6305 6310 6315 6320 Ala Leu Asp Thr Ala LeuGly His Gly Asp Thr Ala Val Thr Ile Ala 6325 6330 6335 Asp Val Asp TrpSer Ser Phe Ala Pro Gly Phe Thr Thr Ala Arg Pro 6340 6345 6350 Gly ThrLeu Leu Ala Asp Leu Pro Glu Ala Arg Arg Ala Leu Asp Glu 6355 6360 6365Gln Gln Ser Thr Thr Ala Ala Asp Asp Thr Val Leu Ser Arg Glu Leu 63706375 6380 Gly Ala Leu Thr Gly Ala Glu Gln Gln Arg Arg Met Gln Glu LeuVal 6385 6390 6395 6400 Arg Glu His Leu Ala Val Val Leu Asn His Pro SerPro Glu Ala Val 6405 6410 6415 Asp Thr Gly Arg Ala Phe Arg Asp Leu GlyPhe Asp Ser Leu Thr Ala 6420 6425 6430 Val Glu Leu Arg Asn Arg Leu LysAsn Ala Thr Gly Leu Ala Leu Pro 6435 6440 6445 Ala Thr Leu Val Phe AspTyr Pro Thr Pro Arg Thr Leu Ala Glu Phe 
6450 6455 6460 Leu Leu Ala GluIle Leu Gly Glu Gln Ala Gly Ala Gly Glu Gln Leu 6465 6470 6475 6480 ProVal Asp Gly Gly Val Asp Asp Glu Pro Val Ala Ile Val Gly Met 6485 64906495 Ala Cys Arg Leu Pro Gly Gly Val Ala Ser Pro Glu Asp Leu Trp Arg6500 6505 6510 Leu Val Ala Gly Gly Glu Asp Ala Ile Ser Gly Phe Pro GlnAsp Arg 6515 6520 6525 Gly Trp Asp Val Glu Gly Leu Tyr Asp Pro Asp ProAsp Ala Ser Gly 6530 6535 6540 Arg Thr Tyr Cys Arg Ala Gly Gly Phe LeuAsp Glu Ala Gly Glu Phe 6545 6550 6555 6560 Asp Ala Asp Phe Phe Gly IleSer Pro Arg Glu Ala Leu Ala Met Asp 6565 6570 6575 Pro Gln Gln Arg LeuLeu Leu Glu Thr Ser Trp Glu Ala Val Glu Asp 6580 6585 6590 Ala Gly IleAsp Pro Thr Ser Leu Gln Gly Gln Gln Val Gly Val Phe 6595 6600 6605 AlaGly Thr Asn Gly Pro His Tyr Glu Pro Leu Leu Arg Asn Thr Ala 6610 66156620 Glu Asp Leu Glu Gly Tyr Val Gly Thr Gly Asn Ala Ala Ser Ile Met6625 6630 6635 6640 Ser Gly Arg Val Ser Tyr Thr Leu Gly Leu Glu Gly ProAla Val Thr 6645 6650 6655 Val Asp Thr Ala Cys Ser Ser Ser Leu Val AlaLeu His Leu Ala Val 6660 6665 6670 Gln Ala Leu Arg Lys Gly Glu Cys GlyLeu Ala Leu Ala Gly Gly Val 6675 6680 6685 Thr Val Met Ser Thr Pro ThrThr Phe Val Glu Phe Ser Arg Gln Arg 6690 6695 6700 Gly Leu Ala Glu AspGly Arg Ser Lys Ala Phe Ala Ala Ser Ala Asp 6705 6710 6715 6720 Gly PheGly Pro Ala Glu Gly Val Gly Met Leu Leu Val Glu Arg Leu 6725 6730 6735Ser Asp Ala Arg Arg Asn Gly His Arg Val Leu Ala Val Val Arg Gly 67406745 6750 Ser Ala Val Asn Gln Asp Gly Ala Ser Asn Gly Leu Thr Ala ProAsn 6755 6760 6765 Gly Pro Ser Gln Gln Arg Val Ile Arg Arg Ala Leu AlaAsp Ala Arg 6770 6775 6780 Leu Thr Thr Ala Asp Val Asp Val Val Glu AlaHis Gly Thr Gly Thr 6785 6790 6795 6800 Arg Leu Gly Asp Pro Ile Glu AlaGln Ala Leu Ile Ala Thr Tyr Gly 6805 6810 6815 Gln Gly Arg Asp Thr GluGln Pro Leu Arg Leu Gly Ser Leu Lys Ser 6820 6825 6830 Asn Ile Gly HisThr Gln Ala Ala Ala Gly Val Ser Gly Ile Ile Lys 6835 6840 6845 Met ValGln Ala Met Arg His Gly Val Leu Pro Lys Thr Leu His Val 6850 6855 6860Asp Arg Pro 
Ser Asp Gln Ile Asp Trp Ser Ala Gly Thr Val Glu Leu 68656870 6875 6880 Leu Thr Glu Ala Met Asp Trp Pro Arg Lys Gln Glu Gly GlyLeu Arg 6885 6890 6895 Arg Ala Ala Val Ser Ser Phe Gly Ile Ser Gly ThrAsn Ala His Ile 6900 6905 6910 Val Leu Glu Glu Ala Pro Val Asp Glu AspAla Pro Ala Asp Glu Pro 6915 6920 6925 Ser Val Gly Gly Val Val Pro TrpLeu Val Ser Ala Lys Thr Pro Ala 6930 6935 6940 Ala Leu Asp Ala Gln IleGly Arg Leu Ala Ala Phe Ala Ser Gln Gly 6945 6950 6955 6960 Arg Thr AspAla Ala Asp Pro Gly Ala Val Ala Arg Val Leu Ala Gly 6965 6970 6975 GlyArg Ala Gln Phe Glu His Arg Ala Val Ala Leu Gly Thr Gly Gln 6980 69856990 Asp Asp Leu Ala Ala Ala Leu Ala Ala Pro Glu Gly Leu Val Arg Gly6995 7000 7005 Val Ala Ser Gly Val Gly Arg Val Ala Phe Val Phe Pro GlyGln Gly 7010 7015 7020 Thr Gln Trp Ala Gly Met Gly Ala Glu Leu Leu AspVal Ser Lys Glu 7025 7030 7035 7040 Phe Ala Ala Ala Met Ala Glu Cys GluAla Ala Leu Ala Pro Tyr Val 7045 7050 7055 Asp Trp Ser Leu Glu Ala ValVal Arg Gln Ala Pro Gly Ala Pro Thr 7060 7065 7070 Leu Glu Arg Val AspVal Val Gln Pro Val Thr Phe Ala Val Met Val 7075 7080 7085 Ser Leu AlaLys Val Trp Gln His His Gly Val Thr Pro Gln Ala Val 7090 7095 7100 ValGly His Ser Gln Gly Glu Ile Ala Ala Ala Tyr Val Ala Gly Ala 7105 71107115 7120 Leu Ser Leu Asp Asp Ala Ala Arg Val Val Thr Leu Arg Ser LysSer 7125 7130 7135 Ile Gly Ala His Leu Ala Gly Gln Gly Gly Met Leu SerLeu Ala Leu 7140 7145 7150 Ser Glu Ala Ala Val Val Glu Arg Leu Ala GlyPhe Asp Gly Leu Ser 7155 7160 7165 Val Ala Ala Val Asn Gly Pro Thr AlaThr Val Val Ser Gly Asp Pro 7170 7175 7180 Thr Gln Ile Gln Glu Leu AlaGln Ala Cys Glu Ala Asp Gly Val Arg 7185 7190 7195 7200 Ala Arg Ile IlePro Val Asp Tyr Ala Ser His Ser Ala His Val Glu 7205 7210 7215 Thr IleGlu Ser Glu Leu Ala Asp Val Leu Ala Gly Leu Ser Pro Gln 7220 7225 7230Thr Pro Gln Val Pro Phe Phe Ser Thr Leu Glu Gly Ala Trp Ile Thr 72357240 7245 Glu Pro Ala Leu Asp Gly Gly Tyr Trp Tyr Arg Asn Leu Arg HisArg 7250 7255 7260 Val Gly Phe Ala Pro Ala Val Glu Thr Leu 
Ala Thr AspGlu Gly Phe 7265 7270 7275 7280 Thr His Phe Val Glu Val Ser Ala His ProVal Leu Thr Met Ala Leu 7285 7290 7295 Pro Glu Thr Val Thr Gly Leu GlyThr Leu Arg Arg Asp Asn Gly Gly 7300 7305 7310 Gln His Arg Leu Thr ThrSer Leu Ala Glu Ala Trp Ala Asn Gly Leu 7315 7320 7325 Thr Val Asp TrpAla Ser Leu Leu Pro Thr Thr Thr Thr His Pro Asp 7330 7335 7340 Leu ProThr Tyr Ala Phe Gln Thr Glu Arg Tyr Trp Pro Gln Pro Asp 7345 7350 73557360 Leu Ser Ala Ala Gly Asp Ile Thr Ser Ala Gly Leu Gly Ala Ala Glu7365 7370 7375 His Pro Leu Leu Gly Ala Ala Val Ala Leu Ala Asp Ser AspGly Cys 7380 7385 7390 Leu Leu Thr Gly Ser Leu Ser Leu Arg Thr His ProTrp Leu Ala Asp 7395 7400 7405 His Ala Val Ala Gly Thr Val Leu Leu ProGly Thr Ala Phe Val Glu 7410 7415 7420 Leu Ala Phe Arg Ala Gly Asp GlnVal Gly Cys Asp Leu Val Glu Glu 7425 7430 7435 7440 Leu Thr Leu Asp AlaPro Leu Val Leu Pro Arg Arg Gly Ala Val Arg 7445 7450 7455 Val Gln LeuSer Val Gly Ala Ser Asp Glu Ser Gly Arg Arg Thr Phe 7460 7465 7470 GlyLeu Tyr Ala His Pro Glu Asp Ala Pro Gly Glu Ala Glu Trp Thr 7475 74807485 Arg His Ala Thr Gly Val Leu Ala Ala Arg Ala Asp Arg Thr Ala Pro7490 7495 7500 Val Ala Asp Pro Glu Ala Trp Pro Pro Pro Gly Ala Glu ProVal Asp 7505 7510 7515 7520 Val Asp Gly Leu Tyr Glu Arg Phe Ala Ala AsnGly Tyr Gly Tyr Gly 7525 7530 7535 Pro Leu Phe Gln Gly Val Arg Gly ValTrp Arg Arg Gly Asp Glu Val 7540 7545 7550 Phe Ala Asp Val Ala Leu ProAla Glu Val Ala Gly Ala Glu Gly Ala 7555 7560 7565 Arg Phe Gly Leu HisPro Ala Leu Leu Asp Ala Ala Val Gln Ala Ala 7570 7575 7580 Gly Ala GlyArg Gly Val Arg Arg Gly His Ala Ala Ala Val Arg Leu 7585 7590 7595 7600Glu Arg Asp Leu Leu Tyr Ala Val Gly Ala Thr Ala Leu Arg Val Arg 76057610 7615 Leu Ala Pro Ala Gly Pro Asp Thr Val Ser Val Ser Ala Ala AspSer 7620 7625 7630 Ser Gly Gln Pro Val Phe Ala Ala Asp Ser Leu Thr ValLeu Pro Val 7635 7640 7645 Asp Pro Ala Gln Leu Ala Ala Phe Ser Asp ProThr Leu Asp Ala Leu 7650 7655 7660 His Leu Leu Glu Trp Thr Ala Trp AspGly Ala Ala Gln Ala Leu Pro 7665 
7670 7675 7680 Gly Ala Val Val Leu GlyGly Asp Ala Asp Gly Leu Ala Ala Ala Leu 7685 7690 7695 Arg Ala Gly GlyThr Glu Val Leu Ser Phe Pro Asp Leu Thr Asp Leu 7700 7705 7710 Val GluAla Val Asp Arg Gly Glu Thr Pro Ala Pro Ala Thr Val Leu 7715 7720 7725Val Ala Cys Pro Ala Ala Gly Pro Asp Gly Pro Glu His Val Arg Glu 77307735 7740 Ala Leu His Gly Ser Leu Ala Leu Met Gln Ala Trp Leu Ala AspGlu 7745 7750 7755 7760 Arg Phe Thr Asp Gly Arg Leu Val Leu Val Thr ArgAsp Ala Val Ala 7765 7770 7775 Ala Arg Ser Gly Asp Gly Leu Arg Ser ThrGly Gln Ala Ala Val Trp 7780 7785 7790 Gly Leu Gly Arg Ser Ala Gln ThrGlu Ser Pro Gly Arg Phe Val Leu 7795 7800 7805 Leu Asp Leu Ala Gly GluAla Arg Thr Ala Gly Asp Ala Thr Ala Gly 7810 7815 7820 Asp Gly Leu ThrThr Gly Asp Ala Thr Val Gly Gly Thr Ser Gly Asp 7825 7830 7835 7840 AlaAla Leu Gly Ser Ala Leu Ala Thr Ala Leu Gly Ser Gly Glu Pro 7845 78507855 Gln Leu Ala Leu Arg Asp Gly Ala Leu Leu Val Pro Arg Leu Ala Arg7860 7865 7870 Ala Ala Ala Pro Ala Ala Ala Asp Gly Leu Ala Ala Ala AspGly Leu 7875 7880 7885 Ala Ala Leu Pro Leu Pro Ala Ala Pro Ala Leu TrpArg Leu Glu Pro 7890 7895 7900 Gly Thr Asp Gly Ser Leu Glu Ser Leu ThrAla Ala Pro Gly Asp Ala 7905 7910 7915 7920 Glu Thr Leu Ala Pro Glu ProLeu Gly Pro Gly Gln Val Arg Ile Ala 7925 7930 7935 Ile Arg Ala Thr GlyLeu Asn Phe Arg Asp Val Leu Ile Ala Leu Gly 7940 7945 7950 Met Tyr ProAsp Pro Ala Leu Met Gly Thr Glu Gly Ala Gly Val Val 7955 7960 7965 ThrAla Thr Gly Pro Gly Val Thr His Leu Ala Pro Gly Asp Arg Val 7970 79757980 Met Gly Leu Leu Ser Gly Ala Tyr Ala Pro Val Val Val Ala Asp Ala7985 7990 7995 8000 Arg Thr Val Ala Arg Met Pro Glu Gly Trp Thr Phe AlaGln Gly Ala 8005 8010 8015 Ser Val Pro Val Val Phe Leu Thr Ala Val TyrAla Leu Arg Asp Leu 8020 8025 8030 Ala Asp Val Lys Pro Gly Glu Arg LeuLeu Val His Ser Ala Ala Gly 8035 8040 8045 Gly Val Gly Met Ala Ala ValGln Leu Ala Arg His Trp Gly Val Glu 8050 8055 8060 Val His Gly Thr AlaSer His Gly Lys Trp Asp Ala Leu Arg Ala Leu 8065 8070 8075 8080 Gly LeuAsp 
Asp Ala His Ile Ala Ser Ser Arg Thr Leu Asp Phe Glu 8085 8090 8095Ser Ala Phe Arg Ala Ala Ser Gly Gly Ala Gly Met Asp Val Val Leu 81008105 8110 Asn Ser Leu Ala Arg Glu Phe Val Asp Ala Ser Leu Arg Leu LeuGly 8115 8120 8125 Pro Gly Gly Arg Phe Val Glu Met Gly Lys Thr Asp ValArg Asp Ala 8130 8135 8140 Glu Arg Val Ala Ala Asp His Pro Gly Val GlyTyr Arg Ala Phe Asp 8145 8150 8155 8160 Leu Gly Glu Ala Gly Pro Glu ArgIle Gly Glu Met Leu Ala Glu Val 8165 8170 8175 Ile Ala Leu Phe Glu AspGly Val Leu Arg His Leu Pro Val Thr Thr 8180 8185 8190 Trp Asp Val ArgArg Ala Arg Asp Ala Phe Arg His Val Ser Gln Ala 8195 8200 8205 Arg HisThr Gly Lys Val Val Leu Thr Met Pro Ser Gly Leu Asp Pro 8210 8215 8220Glu Gly Thr Val Leu Leu Thr Gly Gly Thr Gly Ala Leu Gly Gly Ile 82258230 8235 8240 Val Ala Arg His Val Val Gly Glu Trp Gly Val Arg Arg LeuLeu Leu 8245 8250 8255 Val Ser Arg Arg Gly Thr Asp Ala Pro Gly Ala GlyGlu Leu Val His 8260 8265 8270 Glu Leu Glu Ala Leu Gly Ala Asp Val SerVal Ala Ala Cys Asp Val 8275 8280 8285 Ala Asp Arg Glu Ala Leu Thr AlaVal Leu Asp Ser Ile Pro Ala Glu 8290 8295 8300 His Pro Leu Thr Ala ValVal His Thr Ala Gly Val Leu Ser Asp Gly 8305 8310 8315 8320 Thr Leu ProSer Met Thr Ala Glu Asp Val Glu His Val Leu Arg Pro 8325 8330 8335 LysVal Asp Ala Ala Phe Leu Leu Asp Glu Leu Thr Ser Thr Pro Gly 8340 83458350 Tyr Asp Leu Ala Ala Phe Val Met Phe Ser Ser Ala Ala Ala Val Phe8355 8360 8365 Gly Gly Ala Gly Gln Gly Ala Tyr Ala Ala Ala Asn Ala ThrLeu Asp 8370 8375 8380 Ala Leu Ala Trp Arg Arg Arg Thr Ala Gly Leu ProAla Leu Ser Leu 8385 8390 8395 8400 Gly Trp Gly Leu Trp Ala Glu Thr SerGly Met Thr Gly Gly Leu Ser 8405 8410 8415 Asp Thr Asp Arg Ser Arg LeuAla Arg Ser Gly Ala Thr Pro Met Asp 8420 8425 8430 Ser Glu Leu Thr LeuSer Leu Leu Asp Ala Ala Met Arg Arg Asp Asp 8435 8440 8445 Pro Ala LeuVal Pro Ile Ala Leu Asp Val Ala Ala Leu Arg Ala Gln 8450 8455 8460 GlnArg Asp Gly Met Leu Ala Pro Leu Leu Ser Gly Leu Thr Arg Gly 8465 84708475 8480 Ser Arg Val Gly Gly Ala Pro Val Asn Gln 
Arg Arg Ala Ala AlaGly 8485 8490 8495 Gly Ala Gly Glu Ala Asp Thr Asp Leu Gly Gly Arg LeuAla Ala Met 8500 8505 8510 Thr Pro Asp Asp Arg Val Ala His Leu Arg AspLeu Val Arg Thr His 8515 8520 8525 Val Ala Thr Val Leu Gly His Gly ThrPro Ser Arg Val Asp Leu Glu 8530 8535 8540 Arg Ala Phe Arg Asp Thr GlyPhe Asp Ser Leu Thr Ala Val Glu Leu 8545 8550 8555 8560 Arg Asn Arg LeuAsn Ala Ala Thr Gly Leu Arg Leu Pro Ala Thr Leu 8565 8570 8575 Val PheAsp His Pro Thr Pro Gly Glu Leu Ala Gly His Leu Leu Asp 8580 8585 8590Glu Leu Ala Thr Ala Ala Gly Gly Ser Trp Ala Glu Gly Thr Gly Ser 85958600 8605 Gly Asp Thr Ala Ser Ala Thr Asp Arg Gln Thr Thr Ala Ala LeuAla 8610 8615 8620 Glu Leu Asp Arg Leu Glu Gly Val Leu Ala Ser Leu AlaPro Ala Ala 8625 8630 8635 8640 Gly Gly Arg Pro Glu Leu Ala Ala Arg LeuArg Ala Leu Ala Ala Ala 8645 8650 8655 Leu Gly Asp Asp Gly Asp Asp AlaThr Asp Leu Asp Glu Ala Ser Asp 8660 8665 8670 Asp Asp Leu Phe Ser PheIle Asp Lys Glu Leu Gly Asp Ser Asp Phe 8675 8680 8685 Met Ala Asn AsnGlu Asp Lys Leu Arg Asp Tyr Leu Lys Arg Val Thr 8690 8695 8700 Ala GluLeu Gln Gln Asn Thr Arg Arg Leu Arg Glu Ile Glu Gly Arg 8705 8710 87158720 Thr His Glu Pro Val Ala Ile Val Gly Met Ala Cys Arg Leu Pro Gly8725 8730 8735 Gly Val Ala Ser Pro Glu Asp Leu Trp Gln Leu Val Ala GlyAsp Gly 8740 8745 8750 Asp Ala Ile Ser Glu Phe Pro Gln Asp Arg Gly TrpAsp Val Glu Gly 8755 8760 8765 Leu Tyr Asp Pro Asp Pro Asp Ala Ser GlyArg Thr Tyr Cys Arg Ser 8770 8775 8780 Gly Gly Phe Leu His Asp Ala GlyGlu Phe Asp Ala Asp Phe Phe Gly 8785 8790 8795 8800 Ile Ser Pro Arg GluAla Leu Ala Met Asp Pro Gln Gln Arg Leu Ser 8805 8810 8815 Leu Thr ThrAla Trp Glu Ala Ile Glu Ser Ala Gly Ile Asp Pro Thr 8820 8825 8830 AlaLeu Lys Gly Ser Gly Leu Gly Val Phe Val Gly Gly Trp His Thr 8835 88408845 Gly Tyr Thr Ser Gly Gln Thr Thr Ala Val Gln Ser Pro Glu Leu Glu8850 8855 8860 Gly His Leu Val Ser Gly Ala Ala Leu Gly Phe Leu Ser GlyArg Ile 8865 8870 8875 8880 Ala Tyr Val Leu Gly Thr Asp Gly Pro Ala LeuThr Val Asp Thr Ala 8885 
8890 8895 Cys Ser Ser Ser Leu Val Ala Leu HisLeu Ala Val Gln Ala Leu Arg 8900 8905 8910 Lys Gly Glu Cys Asp Met AlaLeu Ala Gly Gly Val Thr Val Met Pro 8915 8920 8925 Asn Ala Asp Leu PheVal Gln Phe Ser Arg Gln Arg Gly Leu Ala Ala 8930 8935 8940 Asp Gly ArgSer Lys Ala Phe Ala Thr Ser Ala Asp Gly Phe Gly Pro 8945 8950 8955 8960Ala Glu Gly Ala Gly Val Leu Leu Val Glu Arg Leu Ser Asp Ala Arg 89658970 8975 Arg Asn Gly His Arg Ile Leu Ala Val Val Arg Gly Ser Ala ValAsn 8980 8985 8990 Gln Asp Gly Ala Ser Asn Gly Leu Thr Ala Pro His GlyPro Ser Gln 8995 9000 9005 Gln Arg Val Ile Arg Arg Ala Leu Ala Asp AlaArg Leu Ala Pro Gly 9010 9015 9020 Asp Val Asp Val Val Glu Ala His GlyThr Gly Thr Arg Leu Gly Asp 9025 9030 9035 9040 Pro Ile Glu Ala Gln AlaLeu Ile Ala Thr Tyr Gly Gln Glu Lys Ser 9045 9050 9055 Ser Glu Gln ProLeu Arg Leu Gly Ala Leu Lys Ser Asn Ile Gly His 9060 9065 9070 Thr GlnAla Ala Ala Gly Val Ala Gly Val Ile Lys Met Val Gln Ala 9075 9080 9085Met Arg His Gly Leu Leu Pro Lys Thr Leu His Val Asp Glu Pro Ser 90909095 9100 Asp Gln Ile Asp Trp Ser Ala Gly Thr Val Glu Leu Leu Thr GluAla 9105 9110 9115 9120 Val Asp Trp Pro Glu Lys Gln Asp Gly Gly Leu ArgArg Ala Ala Val 9125 9130 9135 Ser Ser Phe Gly Ile Ser Gly Thr Asn AlaHis Val Val Leu Glu Glu 9140 9145 9150 Ala Pro Ala Val Glu Asp Ser ProAla Val Glu Pro Pro Ala Gly Gly 9155 9160 9165 Gly Val Val Pro Trp ProVal Ser Ala Lys Thr Pro Ala Ala Leu Asp 9170 9175 9180 Ala Gln Ile GlyGln Leu Ala Ala Tyr Ala Asp Gly Arg Thr Asp Val 9185 9190 9195 9200 AspPro Ala Val Ala Ala Arg Ala Leu Val Asp Ser Arg Thr Ala Met 9205 92109215 Glu His Arg Ala Val Ala Val Gly Asp Ser Arg Glu Ala Leu Arg Asp9220 9225 9230 Ala Leu Arg Met Pro Glu Gly Leu Val Arg Gly Thr Ser SerAsp Val 9235 9240 9245 Gly Arg Val Ala Phe Val Phe Pro Gly Gln Gly ThrGln Trp Ala Gly 9250 9255 9260 Met Gly Ala Glu Leu Leu Asp Ser Ser ProGlu Phe Ala Ala Ser Met 9265 9270 9275 9280 Ala Glu Cys Glu Thr Ala LeuSer Arg Tyr Val Asp Trp Ser Leu Glu 9285 9290 9295 Ala Val Val Arg 
GlnGlu Pro Gly Ala Pro Thr Leu Asp Arg Val Asp 9300 9305 9310 Val Val GlnPro Val Thr Phe Ala Val Met Val Ser Leu Ala Lys Val 9315 9320 9325 TrpGln His His Gly Ile Thr Pro Gln Ala Val Val Gly His Ser Gln 9330 93359340 Gly Glu Ile Ala Ala Ala Tyr Val Ala Gly Ala Leu Thr Leu Asp Asp9345 9350 9355 9360 Ala Ala Arg Val Val Thr Leu Arg Ser Lys Ser Ile AlaAla His Leu 9365 9370 9375 Ala Gly Lys Gly Gly Met Ile Ser Leu Ala LeuAsp Glu Ala Ala Val 9380 9385 9390 Leu Lys Arg Leu Ser Asp Phe Asp GlyLeu Ser Val Ala Ala Val Asn 9395 9400 9405 Gly Pro Thr Ala Thr Val ValSer Gly Asp Pro Thr Gln Ile Glu Glu 9410 9415 9420 Leu Ala Arg Thr CysGlu Ala Asp Gly Val Arg Ala Arg Ile Ile Pro 9425 9430 9435 9440 Val AspTyr Ala Ser His Ser Arg Gln Val Glu Ile Ile Glu Lys Glu 9445 9450 9455Leu Ala Glu Val Leu Ala Gly Leu Ala Pro Gln Ala Pro His Val Pro 94609465 9470 Phe Phe Ser Thr Leu Glu Gly Thr Trp Ile Thr Glu Pro Val LeuAsp 9475 9480 9485 Gly Thr Tyr Trp Tyr Arg Asn Leu Arg His Arg Val GlyPhe Ala Pro 9490 9495 9500 Ala Val Glu Thr Leu Ala Val Asp Gly Phe ThrHis Phe Ile Glu Val 9505 9510 9515 9520 Ser Ala His Pro Val Leu Thr MetThr Leu Pro Glu Thr Val Thr Gly 9525 9530 9535 Leu Gly Thr Leu Arg ArgGlu Gln Gly Gly Gln Glu Arg Leu Val Thr 9540 9545 9550 Ser Leu Ala GluAla Trp Ala Asn Gly Leu Thr Ile Asp Trp Ala Pro 9555 9560 9565 Ile LeuPro Thr Ala Thr Gly His His Pro Glu Leu Pro Thr Tyr Ala 9570 9575 9580Phe Gln Thr Glu Arg Phe Trp Leu Gln Ser Ser Ala Pro Thr Ser Ala 95859590 9595 9600 Ala Asp Asp Trp Arg Tyr Arg Val Glu Trp Lys Pro Leu ThrAla Ser 9605 9610 9615 Gly Gln Ala Asp Leu Ser Gly Arg Trp Ile Val AlaVal Gly Ser Glu 9620 9625 9630 Pro Glu Ala Glu Leu Leu Gly Ala Leu LysAla Ala Gly Ala Glu Val 9635 9640 9645 Asp Val Leu Glu Ala Gly Ala AspAsp Asp Arg Glu Ala Leu Ala Ala 9650 9655 9660 Arg Leu Thr Ala Leu ThrThr Gly Asp Gly Phe Thr Gly Val Val Ser 9665 9670 9675 9680 Leu Leu AspAsp Leu Val Pro Gln Val Ala Trp Val Gln Ala Leu Gly 9685 9690 9695 AspAla Gly Ile Lys Ala Pro Leu Trp Ser Val 
Thr Gln Gly Ala Val 9700 97059710 Ser Val Gly Arg Leu Asp Thr Pro Ala Asp Pro Asp Arg Ala Met Leu9715 9720 9725 Trp Gly Leu Gly Arg Val Val Ala Leu Glu His Pro Glu ArgTrp Ala 9730 9735 9740 Gly Leu Val Asp Leu Pro Ala Gln Pro Asp Ala AlaAla Leu Ala His 9745 9750 9755 9760 Leu Val Thr Ala Leu Ser Gly Ala ThrGly Glu Asp Gln Ile Ala Ile 9765 9770 9775 Arg Thr Thr Gly Leu His AlaArg Arg Leu Ala Arg Ala Pro Leu His 9780 9785 9790 Gly Arg Arg Pro ThrArg Asp Trp Gln Pro His Gly Thr Val Leu Ile 9795 9800 9805 Thr Gly GlyThr Gly Ala Leu Gly Ser His Ala Ala Arg Trp Met Ala 9810 9815 9820 HisHis Gly Ala Glu His Leu Leu Leu Val Ser Arg Ser Gly Glu Gln 9825 98309835 9840 Ala Pro Gly Ala Thr Gln Leu Thr Ala Glu Leu Thr Ala Ser GlyAla 9845 9850 9855 Arg Val Thr Ile Ala Ala Cys Asp Val Ala Asp Pro HisAla Met Arg 9860 9865 9870 Thr Leu Leu Asp Ala Ile Pro Ala Glu Thr ProLeu Thr Ala Val Val 9875 9880 9885 His Thr Ala Gly Ala Pro Gly Gly AspPro Leu Asp Val Thr Gly Pro 9890 9895 9900 Glu Asp Ile Ala Arg Ile LeuGly Ala Lys Thr Ser Gly Ala Glu Val 9905 9910 9915 9920 Leu Asp Asp LeuLeu Arg Gly Thr Pro Leu Asp Ala Phe Val Leu Tyr 9925 9930 9935 Ser SerAsn Ala Gly Val Trp Gly Ser Gly Ser Gln Gly Val Tyr Ala 9940 9945 9950Ala Ala Asn Ala His Leu Asp Ala Leu Ala Ala Arg Arg Arg Ala Arg 99559960 9965 Gly Glu Thr Ala Thr Ser Val Ala Trp Gly Leu Trp Ala Gly AspGly 9970 9975 9980 Met Gly Arg Gly Ala Asp Asp Ala Tyr Trp Gln Arg ArgGly Ile Arg 9985 9990 9995 10000 Pro Met Ser Pro Asp Arg Ala Leu Asp GluLeu Ala Lys Ala Leu Ser 10005 10010 10015 His Asp Glu Thr Phe Val AlaVal Ala Asp Val Asp Trp Glu Arg Phe 10020 10025 10030 Ala Pro Ala PheThr Val Ser Arg Pro Ser Leu Leu Leu Asp Gly Val 10035 10040 10045 ProGlu Ala Arg Gln Ala Leu Ala Ala Pro Val Gly Ala Pro Ala Pro 10050 1005510060 Gly Asp Ala Ala Val Ala Pro Thr Gly Gln Ser Ser Ala Leu Ala Ala10065 10070 10075 10080 Ile Thr Ala Leu Pro Glu Pro Glu Arg Arg Pro AlaLeu Leu Thr Leu 10085 10090 10095 Val Arg Thr His Ala Ala Ala Val LeuGly His Ser Ser 
Pro Asp Arg 10100 10105 10110 Val Ala Pro Gly Arg AlaPhe Thr Glu Leu Gly Phe Asp Ser Leu Thr 10115 10120 10125 Ala Val GlnLeu Arg Asn Gln Leu Ser Thr Val Val Gly Asn Arg Leu 10130 10135 10140Pro Ala Thr Thr Val Phe Asp His Pro Thr Pro Ala Ala Leu Ala Ala 1014510150 10155 10160 His Leu His Glu Ala Tyr Leu Ala Pro Ala Glu Pro AlaPro Thr Asp 10165 10170 10175 Trp Glu Gly Arg Val Arg Arg Ala Leu AlaGlu Leu Pro Leu Asp Arg 10180 10185 10190 Leu Arg Asp Ala Gly Val LeuAsp Thr Val Leu Arg Leu Thr Gly Ile 10195 10200 10205 Glu Pro Glu ProGly Ser Gly Gly Ser Asp Gly Gly Ala Ala Asp Pro 10210 10215 10220 GlyAla Glu Pro Glu Ala Ser Ile Asp Asp Leu Asp Ala Glu Ala Leu 10225 1023010235 10240 Ile Arg Met Ala Leu Gly Pro Arg Asn Thr Met Thr Ser Ser AsnGlu 10245 10250 10255 Gln Leu Val Asp Ala Leu Arg Ala Ser Leu Lys GluAsn Glu Glu Leu 10260 10265 10270 Arg Lys Glu Ser Arg Arg Arg Ala AspArg Arg Gln Glu Pro Met Ala 10275 10280 10285 Ile Val Gly Met Ser CysArg Phe Ala Gly Gly Ile Arg Ser Pro Glu 10290 10295 10300 Asp Leu TrpAsp Ala Val Ala Ala Gly Lys Asp Leu Val Ser Glu Val 10305 10310 1031510320 Pro Glu Glu Arg Gly Trp Asp Ile Asp Ser Leu Tyr Asp Pro Val Pro10325 10330 10335 Gly Arg Lys Gly Thr Thr Tyr Val Arg Asn Ala Ala PheLeu Asp Asp 10340 10345 10350 Ala Ala Gly Phe Asp Ala Ala Phe Phe GlyIle Ser Pro Arg Glu Ala 10355 10360 10365 Leu Ala Met Asp Pro Gln GlnArg Gln Leu Leu Glu Ala Ser Trp Glu 10370 10375 10380 Val Phe Glu ArgAla Gly Ile Asp Pro Ala Ser Val Arg Gly Thr Asp 10385 10390 10395 10400Val Gly Val Tyr Val Gly Cys Gly Tyr Gln Asp Tyr Ala Pro Asp Ile 1040510410 10415 Arg Val Ala Pro Glu Gly Thr Gly Gly Tyr Val Val Thr Gly AsnSer 10420 10425 10430 Ser Ala Val Ala Ser Gly Arg Ile Ala Tyr Ser LeuGly Leu Glu Gly 10435 10440 10445 Pro Ala Val Thr Val Asp Thr Ala CysSer Ser Ser Leu Val Ala Leu 10450 10455 10460 His Leu Ala Leu Lys GlyLeu Arg Asn Gly Asp Cys Ser Thr Ala Leu 10465 10470 10475 10480 Val GlyGly Val Ala Val Leu Ala Thr Pro Gly Ala Phe Ile Glu Phe 10485 1049010495 
Ser Ser Gln Gln Ala Met Ala Ala Asp Gly Arg Thr Lys Gly Phe Ala10500 10505 10510 Ser Ala Ala Asp Gly Leu Ala Trp Gly Glu Gly Val AlaVal Leu Leu 10515 10520 10525 Leu Glu Arg Leu Ser Asp Ala Arg Arg LysGly His Arg Val Leu Ala 10530 10535 10540 Val Val Arg Gly Ser Ala IleAsn Gln Asp Gly Ala Ser Asn Gly Leu 10545 10550 10555 10560 Thr Ala ProHis Gly Pro Ser Gln Gln His Leu Ile Arg Gln Ala Leu 10565 10570 10575Ala Asp Ala Arg Leu Thr Ser Ser Asp Val Asp Val Val Glu Gly His 1058010585 10590 Gly Thr Gly Thr Arg Leu Gly Asp Pro Ile Glu Ala Gln Ala LeuLeu 10595 10600 10605 Ala Thr Tyr Gly Gln Gly Arg Ala Pro Gly Gln ProLeu Arg Leu Gly 10610 10615 10620 Thr Leu Lys Ser Asn Ile Gly His ThrGln Ala Ala Ser Gly Val Ala 10625 10630 10635 10640 Gly Val Ile Lys MetVal Gln Ala Leu Arg His Gly Val Leu Pro Lys 10645 10650 10655 Thr LeuHis Val Asp Glu Pro Thr Asp Gln Val Asp Trp Ser Ala Gly 10660 1066510670 Ser Val Glu Leu Leu Thr Glu Ala Val Asp Trp Pro Glu Arg Pro Gly10675 10680 10685 Arg Leu Arg Arg Ala Gly Val Ser Ala Phe Gly Val GlyGly Thr Asn 10690 10695 10700 Ala His Val Val Leu Glu Glu Ala Pro AlaVal Glu Glu Ser Pro Ala 10705 10710 10715 10720 Val Glu Pro Pro Ala GlyGly Gly Val Val Pro Trp Pro Val Ser Ala 10725 10730 10735 Lys Thr SerAla Ala Leu Asp Ala Gln Ile Gly Gln Leu Ala Ala Tyr 10740 10745 10750Ala Glu Asp Arg Thr Asp Val Asp Pro Ala Val Ala Ala Arg Ala Leu 1075510760 10765 Val Asp Ser Arg Thr Ala Met Glu His Arg Ala Val Ala Val GlyAsp 10770 10775 10780 Ser Arg Glu Ala Leu Arg Asp Ala Leu Arg Met ProGlu Gly Leu Val 10785 10790 10795 10800 Arg Gly Thr Val Thr Asp Pro GlyArg Val Ala Phe Val Phe Pro Gly 10805 10810 10815 Gln Gly Thr Gln TrpAla Gly Met Gly Ala Glu Leu Leu Asp Ser Ser 10820 10825 10830 Pro GluPhe Ala Ala Ala Met Ala Glu Cys Glu Thr Ala Leu Ser Pro 10835 1084010845 Tyr Val Asp Trp Ser Leu Glu Ala Val Val Arg Gln Ala Pro Ser Ala10850 10855 10860 Pro Thr Leu Asp Arg Val Asp Val Val Gln Pro Val ThrPhe Ala Val 10865 10870 10875 10880 Met Val Ser Leu Ala Lys Val 
Trp GlnHis His Gly Ile Thr Pro Glu 10885 10890 10895 Ala Val Ile Gly His SerGln Gly Glu Ile Ala Ala Ala Tyr Val Ala 10900 10905 10910 Gly Ala LeuThr Leu Asp Asp Ala Ala Arg Val Val Thr Leu Arg Ser 10915 10920 10925Lys Ser Ile Ala Ala His Leu Ala Gly Lys Gly Gly Met Ile Ser Leu 1093010935 10940 Ala Leu Ser Glu Glu Ala Thr Arg Gln Arg Ile Glu Asn Leu HisGly 10945 10950 10955 10960 Leu Ser Ile Ala Ala Val Asn Gly Pro Thr AlaThr Val Val Ser Gly 10965 10970 10975 Asp Pro Thr Gln Ile Gln Glu LeuAla Gln Ala Cys Glu Ala Asp Gly 10980 10985 10990 Ile Arg Ala Arg IleIle Pro Val Asp Tyr Ala Ser His Ser Ala His 10995 11000 11005 Val GluThr Ile Glu Asn Glu Leu Ala Asp Val Leu Ala Gly Leu Ser 11010 1101511020 Pro Gln Thr Pro Gln Val Pro Phe Phe Ser Thr Leu Glu Gly Thr Trp11025 11030 11035 11040 Ile Thr Glu Pro Ala Leu Asp Gly Gly Tyr Trp TyrArg Asn Leu Arg 11045 11050 11055 His Arg Val Gly Phe Ala Pro Ala ValGlu Thr Leu Ala Thr Asp Glu 11060 11065 11070 Gly Phe Thr His Phe IleGlu Val Ser Ala His Pro Val Leu Thr Met 11075 11080 11085 Thr Leu ProAsp Lys Val Thr Gly Leu Ala Thr Leu Arg Arg Glu Asp 11090 11095 11100Gly Gly Gln His Arg Leu Thr Thr Ser Leu Ala Glu Ala Trp Ala Asn 1110511110 11115 11120 Gly Leu Ala Leu Asp Trp Ala Ser Leu Leu Pro Ala ThrGly Ala Leu 11125 11130 11135 Ser Pro Ala Val Pro Asp Leu Pro Thr TyrAla Phe Gln His Arg Ser 11140 11145 11150 Tyr Trp Ile Ser Pro Ala GlyPro Gly Glu Ala Pro Ala His Thr Ala 11155 11160 11165 Ser Gly Arg GluAla Val Ala Glu Thr Gly Leu Ala Trp Gly Pro Gly 11170 11175 11180 AlaGlu Asp Leu Asp Glu Glu Gly Arg Arg Ser Ala Val Leu Ala Met 11185 1119011195 11200 Val Met Arg Gln Ala Ala Ser Val Leu Arg Cys Asp Ser Pro GluGlu 11205 11210 11215 Val Pro Val Asp Arg Pro Leu Arg Glu Ile Gly PheAsp Ser Leu Thr 11220 11225 11230 Ala Val Asp Phe Arg Asn Arg Val AsnArg Leu Thr Gly Leu Gln Leu 11235 11240 11245 Pro Pro Thr Val Val PheGln His Pro Thr Pro Val Ala Leu Ala Glu 11250 11255 11260 Arg Ile SerAsp Glu Leu Ala Glu Arg Asn Trp Ala Val Ala Glu Pro 
11265 11270 1127511280 Ser Asp His Glu Gln Ala Glu Glu Glu Lys Ala Ala Ala Pro Ala Gly11285 11290 11295 Ala Arg Ser Gly Ala Asp Thr Gly Ala Gly Ala Gly MetPhe Arg Ala 11300 11305 11310 Leu Phe Arg Gln Ala Val Glu Asp Asp ArgTyr Gly Glu Phe Leu Asp 11315 11320 11325 Val Leu Ala Glu Ala Ser AlaPhe Arg Pro Gln Phe Ala Ser Pro Glu 11330 11335 11340 Ala Cys Ser GluArg Leu Asp Pro Val Leu Leu Ala Gly Gly Pro Thr 11345 11350 11355 11360Asp Arg Ala Glu Gly Arg Ala Val Leu Val Gly Cys Thr Gly Thr Ala 1136511370 11375 Ala Asn Gly Gly Pro His Glu Phe Leu Arg Leu Ser Thr Ser PheGln 11380 11385 11390 Glu Glu Arg Asp Phe Leu Ala Val Pro Leu Pro GlyTyr Gly Thr Gly 11395 11400 11405 Thr Gly Thr Gly Thr Ala Leu Leu ProAla Asp Leu Asp Thr Ala Leu 11410 11415 11420 Asp Ala Gln Ala Arg AlaIle Leu Arg Ala Ala Gly Asp Ala Pro Val 11425 11430 11435 11440 Val LeuLeu Gly His Ser Gly Gly Ala Leu Leu Ala His Glu Leu Ala 11445 1145011455 Phe Arg Leu Glu Arg Ala His Gly Ala Pro Pro Ala Gly Ile Val Leu11460 11465 11470 Val Asp Pro Tyr Pro Pro Gly His Gln Glu Pro Ile GluVal Trp Ser 11475 11480 11485 Arg Gln Leu Gly Glu Gly Leu Phe Ala GlyGlu Leu Glu Pro Met Ser 11490 11495 11500 Asp Ala Arg Leu Leu Ala MetGly Arg Tyr Ala Arg Phe Leu Ala Gly 11505 11510 11515 11520 Pro Arg ProGly Arg Ser Ser Ala Pro Val Leu Leu Val Arg Ala Ser 11525 11530 11535Glu Pro Leu Gly Asp Trp Gln Glu Glu Arg Gly Asp Trp Arg Ala His 1154011545 11550 Trp Asp Leu Pro His Thr Val Ala Asp Val Pro Gly Asp His PheThr 11555 11560 11565 Met Met Arg Asp His Ala Pro Ala Val Ala Glu AlaVal Leu Ser Trp 11570 11575 11580 Leu Asp Ala Ile Glu Gly Ile Glu GlyAla Gly Lys Met Thr Asp Arg 11585 11590 11595 11600 Pro Leu Asn Val AspSer Gly Leu Trp Ile Arg Arg Phe His Pro Ala 11605 11610 11615 Pro AsnSer Ala Val Arg Leu Val Cys Leu Pro His Ala Gly Gly Ser 11620 1162511630 Ala Ser Tyr Phe Phe Arg Phe Ser Glu Glu Leu His Pro Ser Val Glu11635 11640 11645 Ala Leu Ser Val Gln Tyr Pro Gly Arg Gln Asp Arg ArgAla Glu Pro 11650 11655 11660 Cys Leu Glu 
Ser Val Glu Glu Leu Ala GluHis Val Val Ala Ala Thr 11665 11670 11675 11680 Glu Pro Trp Trp Gln GluGly Arg Leu Ala Phe Phe Gly His Ser Leu 11685 11690 11695 Gly Ala SerVal Ala Phe Glu Thr Ala Arg Ile Leu Glu Gln Arg His 11700 11705 11710Gly Val Arg Pro Glu Gly Leu Tyr Val Ser Gly Arg Arg Ala Pro Ser 1171511720 11725 Leu Ala Pro Asp Arg Leu Val His Gln Leu Asp Asp Arg Ala PheLeu 11730 11735 11740 Ala Glu Ile Arg Arg Leu Ser Gly Thr Asp Glu ArgPhe Leu Gln Asp 11745 11750 11755 11760 Asp Glu Leu Leu Arg Leu Val LeuPro Ala Leu Arg Ser Asp Tyr Lys 11765 11770 11775 Ala Ala Glu Thr TyrLeu His Arg Pro Ser Ala Lys Leu Thr Cys Pro 11780 11785 11790 Val MetAla Leu Ala Gly Asp Arg Asp Pro Lys Ala Pro Leu Asn Glu 11795 1180011805 Val Ala Glu Trp Arg Arg His Thr Ser Gly Pro Phe Cys Leu Arg Ala11810 11815 11820 Tyr Ser Gly Gly His Phe Tyr Leu Asn Asp Gln Trp HisGlu Ile Cys 11825 11830 11835 11840 Asn Asp Ile Ser Asp His Leu Leu ValThr Arg Gly Ala Pro Asp Ala 11845 11850 11855 Arg Val Val Gln Pro ProThr Ser Leu Ile Glu Gly Ala Ala Lys Arg 11860 11865 11870 Trp Gln AsnPro Arg 11875 7 1248 DNA Streptomyces venezuelae 7 gtgaaaagcg ccttatccgacctcgcattc ttcggcggcc ccgccgcttt cgaccagccg 60 ctcctcgtgg ggcggcccaaccgcatcgac cgcgccaggc tgtacgagcg gctcgaccgg 120 gccctcgaca gccagtggctgtccaacggc ggcccgctcg tccgcgagtt cgaggagcgc 180 gtcgccgggc tcgccggggtccggcatgcc gtggccacct gcaacgccac ggccgggctc 240 cagctcctcg cgcacgccgccggcctcacc ggcgaagtga tcatgccgtc gatgacgttc 300 gccgccaccc cgcacgcactgcgctggatc ggcctcaccc cggtcttcgc cgacatcgac 360 ccggacaccg gcaacctcgacccggaccag gtggccgccg cggtcacacc ccgcacctcg 420 gccgtcgtcg gcgtccacctctggggccgc ccctgcgccg ccgaccagct gcggaaggtc 480 gccgacgagc acggcctgcggctgtacttc gacgccgcgc acgccctcgg ctgcgcggtc 540 gacggccggc ccgccggcagcctcggcgac gccgaggtct tcagcttcca cgccaccaag 600 gccgtcaacg ccttcgagggcggcgccgtc gtcaccgacg acgccgacct cgccgcccgg 660 atccgcgccc tccacaacttcggcttcgac ctgcccggcg gcagccccgc cggcgggacc 720 aacgccaaga tgagcgaggccgccgccgcc atgggcctca cctccctcga 
cgcgtttccc 780 gaggtcatcg accggaaccggcgcaaccac gccgcctacc gcgagcacct cgcggacctc 840 cccggcgtcc tcgtcgccgaccacgaccgc cacggcctca acaaccacca gtacgtgatc 900 gtcgagatcg acgaggccaccaccggcatc caccgcgacc tcgtcatgga ggtcctgaag 960 gccgaaggcg tgcacacccgcgcctacttc tcgccgggct gccacgagct ggagccgtac 1020 cgcgggcagc cgcacgccccgctgccgcac accgaacgcc tcgccgcgcg cgtgctgtcc 1080 ctgccgaccg gcaccgccatcggcgacgac gacatccgcc gggtcgccga cctgctgcgt 1140 ctctgcgcga cccgcggccgcgaactgacc gcgcgccacc gcgacacggc ccccgccccg 1200 ctcgcggccc cccagacatccacgcccacg attggacgct cccgatga 1248 8 415 PRT Streptomyces venezuelae 8Met Lys Ser Ala Leu Ser Asp Leu Ala Phe Phe Gly Gly Pro Ala Ala 1 5 1015 Phe Asp Gln Pro Leu Leu Val Gly Arg Pro Asn Arg Ile Asp Arg Ala 20 2530 Arg Leu Tyr Glu Arg Leu Asp Arg Ala Leu Asp Ser Gln Trp Leu Ser 35 4045 Asn Gly Gly Pro Leu Val Arg Glu Phe Glu Glu Arg Val Ala Gly Leu 50 5560 Ala Gly Val Arg His Ala Val Ala Thr Cys Asn Ala Thr Ala Gly Leu 65 7075 80 Gln Leu Leu Ala His Ala Ala Gly Leu Thr Gly Glu Val Ile Met Pro 8590 95 Ser Met Thr Phe Ala Ala Thr Pro His Ala Leu Arg Trp Ile Gly Leu100 105 110 Thr Pro Val Phe Ala Asp Ile Asp Pro Asp Thr Gly Asn Leu AspPro 115 120 125 Asp Gln Val Ala Ala Ala Val Thr Pro Arg Thr Ser Ala ValVal Gly 130 135 140 Val His Leu Trp Gly Arg Pro Cys Ala Ala Asp Gln LeuArg Lys Val 145 150 155 160 Ala Asp Glu His Gly Leu Arg Leu Tyr Phe AspAla Ala His Ala Leu 165 170 175 Gly Cys Ala Val Asp Gly Arg Pro Ala GlySer Leu Gly Asp Ala Glu 180 185 190 Val Phe Ser Phe His Ala Thr Lys AlaVal Asn Ala Phe Glu Gly Gly 195 200 205 Ala Val Val Thr Asp Asp Ala AspLeu Ala Ala Arg Ile Arg Ala Leu 210 215 220 His Asn Phe Gly Phe Asp LeuPro Gly Gly Ser Pro Ala Gly Gly Thr 225 230 235 240 Asn Ala Lys Met SerGlu Ala Ala Ala Ala Met Gly Leu Thr Ser Leu 245 250 255 Asp Ala Phe ProGlu Val Ile Asp Arg Asn Arg Arg Asn His Ala Ala 260 265 270 Tyr Arg GluHis Leu Ala Asp Leu Pro Gly Val Leu Val Ala Asp His 275 280 285 Asp ArgHis Gly Leu Asn Asn His Gln Tyr Val Ile Val Glu Ile 
Asp 290 295 300 GluAla Thr Thr Gly Ile His Arg Asp Leu Val Met Glu Val Leu Lys 305 310 315320 Ala Glu Gly Val His Thr Arg Ala Tyr Phe Ser Pro Gly Cys His Glu 325330 335 Leu Glu Pro Tyr Arg Gly Gln Pro His Ala Pro Leu Pro His Thr Glu340 345 350 Arg Leu Ala Ala Arg Val Leu Ser Leu Pro Thr Gly Thr Ala IleGly 355 360 365 Asp Asp Asp Ile Arg Arg Val Ala Asp Leu Leu Arg Leu CysAla Thr 370 375 380 Arg Gly Arg Glu Leu Thr Ala Arg His Arg Asp Thr AlaPro Ala Pro 385 390 395 400 Leu Ala Ala Pro Gln Thr Ser Thr Pro Thr IleGly Arg Ser Arg 405 410 415 9 1458 DNA Streptomyces venezuelae 9atgaccgccc ccgccctttc cgccaccgcc ccggccgaac gctgcgcgca ccccggagcc 60gatctggggg cggcggtcca cgccgtcggc cagaccctcg ccgccggcgg cctcgtgccg 120cccgacgagg ccggaacgac cgcccgccac ctcgtccggc tcgccgtgcg ctacggcaac 180agccccttca ccccgctgga ggaggcccgc cacgacctgg gcgtcgaccg ggacgccttc 240cggcgcctcc tcgccctgtt cgggcaggtc ccggagctcc gcaccgcggt cgagaccggc 300cccgccgggg cgtactggaa gaacaccctg ctcccgctcg aacagcgcgg cgtcttcgac 360gcggcgctcg ccaggaagcc cgtcttcccg tacagcgtcg gcctctaccc cggcccgacc 420tgcatgttcc gctgccactt ctgcgtccgt gtgaccggcg cccgctacga cccgtccgcc 480ctcgacgccg gcaacgccat gttccggtcg gtcatcgacg agatacccgc gggcaacccc 540tcggcgatgt acttctccgg cggcctggag ccgctcacca accccggcct cgggagcctg 600gccgcgcacg ccaccgacca cggcctgcgg cccaccgtct acacgaactc cttcgcgctc 660accgagcgca ccctggagcg ccagcccggc ctctggggcc tgcacgccat ccgcacctcg 720ctctacggcc tcaacgacga ggagtacgag cagaccaccg gcaagaaggc cgccttccgc 780cgcgtccgcg agaacctgcg ccgcttccag cagctgcgcg ccgagcgcga gtcgccgatc 840aacctcggct tcgcctacat cgtgctcccg ggccgtgcct cccgcctgct cgacctggtc 900gacttcatcg ccgacctcaa cgacgccggg cagggcagga cgatcgactt cgtcaacatt 960cgcgaggact acagcggccg tgacgacggc aagctgccgc aggaggagcg ggccgagctc 1020caggaggccc tcaacgcctt cgaggagcgg gtccgcgagc gcacccccgg actccacatc 1080gactacggct acgccctgaa cagcctgcgc accggggccg acgccgaact gctgcggatc 1140aagcccgcca ccatgcggcc caccgcgcac ccgcaggtcg cggtgcaggt cgatctcctc 1200ggcgacgtgt acctgtaccg cgaggccggc 
ttccccgacc tggacggcgc gacccgctac 1260atcgcgggcc gcgtgacccc cgacacctcc ctcaccgagg tcgtcaggga cttcgtcgag 1320cgcggcggcg aggtggcggc cgtcgacggc gacgagtact tcatggacgg cttcgatcag 1380gtcgtcaccg cccgcctgaa ccagctggag cgcgacgccg cggacggctg ggaggaggcc 1440cgcggcttcc tgcgctga 1458 10 485 PRT Streptomyces venezuelae 10 Met ThrAla Pro Ala Leu Ser Ala Thr Ala Pro Ala Glu Arg Cys Ala 1 5 10 15 HisPro Gly Ala Asp Leu Gly Ala Ala Val His Ala Val Gly Gln Thr 20 25 30 LeuAla Ala Gly Gly Leu Val Pro Pro Asp Glu Ala Gly Thr Thr Ala 35 40 45 ArgHis Leu Val Arg Leu Ala Val Arg Tyr Gly Asn Ser Pro Phe Thr 50 55 60 ProLeu Glu Glu Ala Arg His Asp Leu Gly Val Asp Arg Asp Ala Phe 65 70 75 80Arg Arg Leu Leu Ala Leu Phe Gly Gln Val Pro Glu Leu Arg Thr Ala 85 90 95Val Glu Thr Gly Pro Ala Gly Ala Tyr Trp Lys Asn Thr Leu Leu Pro 100 105110 Leu Glu Gln Arg Gly Val Phe Asp Ala Ala Leu Ala Arg Lys Pro Val 115120 125 Phe Pro Tyr Ser Val Gly Leu Tyr Pro Gly Pro Thr Cys Met Phe Arg130 135 140 Cys His Phe Cys Val Arg Val Thr Gly Ala Arg Tyr Asp Pro SerAla 145 150 155 160 Leu Asp Ala Gly Asn Ala Met Phe Arg Ser Val Ile AspGlu Ile Pro 165 170 175 Ala Gly Asn Pro Ser Ala Met Tyr Phe Ser Gly GlyLeu Glu Pro Leu 180 185 190 Thr Asn Pro Gly Leu Gly Ser Leu Ala Ala HisAla Thr Asp His Gly 195 200 205 Leu Arg Pro Thr Val Tyr Thr Asn Ser PheAla Leu Thr Glu Arg Thr 210 215 220 Leu Glu Arg Gln Pro Gly Leu Trp GlyLeu His Ala Ile Arg Thr Ser 225 230 235 240 Leu Tyr Gly Leu Asn Asp GluGlu Tyr Glu Gln Thr Thr Gly Lys Lys 245 250 255 Ala Ala Phe Arg Arg ValArg Glu Asn Leu Arg Arg Phe Gln Gln Leu 260 265 270 Arg Ala Glu Arg GluSer Pro Ile Asn Leu Gly Phe Ala Tyr Ile Val 275 280 285 Leu Pro Gly ArgAla Ser Arg Leu Leu Asp Leu Val Asp Phe Ile Ala 290 295 300 Asp Leu AsnAsp Ala Gly Gln Gly Arg Thr Ile Asp Phe Val Asn Ile 305 310 315 320 ArgGlu Asp Tyr Ser Gly Arg Asp Asp Gly Lys Leu Pro Gln Glu Glu 325 330 335Arg Ala Glu Leu Gln Glu Ala Leu Asn Ala Phe Glu Glu Arg Val Arg 340 345350 Glu Arg Thr Pro Gly Leu His Ile Asp Tyr Gly 
Tyr Ala Leu Asn Ser 355360 365 Leu Arg Thr Gly Ala Asp Ala Glu Leu Leu Arg Ile Lys Pro Ala Thr370 375 380 Met Arg Pro Thr Ala His Pro Gln Val Ala Val Gln Val Asp LeuLeu 385 390 395 400 Gly Asp Val Tyr Leu Tyr Arg Glu Ala Gly Phe Pro AspLeu Asp Gly 405 410 415 Ala Thr Arg Tyr Ile Ala Gly Arg Val Thr Pro AspThr Ser Leu Thr 420 425 430 Glu Val Val Arg Asp Phe Val Glu Arg Gly GlyGlu Val Ala Ala Val 435 440 445 Asp Gly Asp Glu Tyr Phe Met Asp Gly PheAsp Gln Val Val Thr Ala 450 455 460 Arg Leu Asn Gln Leu Glu Arg Asp AlaAla Asp Gly Trp Glu Glu Ala 465 470 475 480 Arg Gly Phe Leu Arg 485 11879 DNA Streptomyces venezuelae 11 atgaagggaa tagtcctggc cggcgggagcggaactcggc tgcatccggc gacctcggtc 60 atttcgaagc agattcttcc ggtctacaacaaaccgatga tctactatcc gctgtcggtt 120 ctcatgctcg gcggtattcg cgagattcaaatcatctcga ccccccagca catcgaactc 180 ttccagtcgc ttctcggaaa cggcaggcacctgggaatag aactcgacta tgcggtccag 240 aaagagcccg caggaatcgc ggacgcacttctcgtcggag ccgagcacat cggcgacgac 300 acctgcgccc tgatcctggg cgacaacatcttccacgggc ccggcctcta cacgctcctg 360 cgggacagca tcgcgcgcct cgacggctgcgtgctcttcg gctacccggt caaggacccc 420 gagcggtacg gcgtcgccga ggtggacgcgacgggccggc tgaccgacct cgtcgagaag 480 cccgtcaagc cgcgctccaa cctcgccgtcaccggcctct acctctacga caacgacgtc 540 gtcgacatcg ccaagaacat ccggccctcgccgcgcggcg agctggagat caccgacgtc 600 aaccgcgtct acctggagcg gggccgggccgaactcgtca acctgggccg cggcttcgcc 660 tggctggaca ccggcaccca cgactcgctcctgcgggccg cccagtacgt ccaggtcctg 720 gaggagcggc agggcgtctg gatcgcgggccttgaggaga tcgccttccg catgggcttc 780 atcgacgccg aggcctgtca cggcctgggagaaggcctct cccgcaccga gtacggcagc 840 tatctgatgg agatcgccgg ccgcgagggagccccgtga 879 12 292 PRT Streptomyces venezuelae 12 Met Lys Gly Ile ValLeu Ala Gly Gly Ser Gly Thr Arg Leu His Pro 1 5 10 15 Ala Thr Ser ValIle Ser Lys Gln Ile Leu Pro Val Tyr Asn Lys Pro 20 25 30 Met Ile Tyr TyrPro Leu Ser Val Leu Met Leu Gly Gly Ile Arg Glu 35 40 45 Ile Gln Ile IleSer Thr Pro Gln His Ile Glu Leu Phe Gln Ser Leu 50 55 60 Leu Gly Asn GlyArg His Leu Gly Ile 
Glu Leu Asp Tyr Ala Val Gln 65 70 75 80 Lys Glu ProAla Gly Ile Ala Asp Ala Leu Leu Val Gly Ala Glu His 85 90 95 Ile Gly AspAsp Thr Cys Ala Leu Ile Leu Gly Asp Asn Ile Phe His 100 105 110 Gly ProGly Leu Tyr Thr Leu Leu Arg Asp Ser Ile Ala Arg Leu Asp 115 120 125 GlyCys Val Leu Phe Gly Tyr Pro Val Lys Asp Pro Glu Arg Tyr Gly 130 135 140Val Ala Glu Val Asp Ala Thr Gly Arg Leu Thr Asp Leu Val Glu Lys 145 150155 160 Pro Val Lys Pro Arg Ser Asn Leu Ala Val Thr Gly Leu Tyr Leu Tyr165 170 175 Asp Asn Asp Val Val Asp Ile Ala Lys Asn Ile Arg Pro Ser ProArg 180 185 190 Gly Glu Leu Glu Ile Thr Asp Val Asn Arg Val Tyr Leu GluArg Gly 195 200 205 Arg Ala Glu Leu Val Asn Leu Gly Arg Gly Phe Ala TrpLeu Asp Thr 210 215 220 Gly Thr His Asp Ser Leu Leu Arg Ala Ala Gln TyrVal Gln Val Leu 225 230 235 240 Glu Glu Arg Gln Gly Val Trp Ile Ala GlyLeu Glu Glu Ile Ala Phe 245 250 255 Arg Met Gly Phe Ile Asp Ala Glu AlaCys His Gly Leu Gly Glu Gly 260 265 270 Leu Ser Arg Thr Glu Tyr Gly SerTyr Leu Met Glu Ile Ala Gly Arg 275 280 285 Glu Gly Ala Pro 290 13 1014DNA Streptomyces venezuelae 13 gtgcggcttc tggtgaccgg aggtgcgggcttcatcggct cgcacttcgt gcggcagctc 60 ctcgccgggg cgtaccccga cgtgcccgccgatgaggtga tcgtcctgga cagcctcacc 120 tacgcgggca accgcgccaa cctcgccccggtggacgcgg acccgcgact gcgcttcgtc 180 cacggcgaca tccgcgacgc cggcctcctcgcccgggaac tgcgcggcgt ggacgccatc 240 gtccacttcg cggccgagag ccacgtggaccgctccatcg cgggcgcgtc cgtgttcacc 300 gagaccaacg tgcagggcac gcagacgctgctccagtgcg ccgtcgacgc cggcgtcggc 360 cgggtcgtgc acgtctccac cgacgaggtgtacgggtcga tcgactccgg ctcctggacc 420 gagagcagcc cgctggagcc caactcgccctacgcggcgt ccaaggccgg ctccgacctc 480 gttgcccgcg cctaccaccg gacgtacggcctcgacgtac ggatcacccg ctgctgcaac 540 aactacgggc cgtaccagca ccccgagaagctcatccccc tcttcgtgac gaacctcctc 600 gacggcggga cgctcccgct gtacggcgacggcgcgaacg tccgcgagtg ggtgcacacc 660 gacgaccact gccggggcat cgcgctcgtcctcgcgggcg gccgggccgg cgagatctac 720 cacatcggcg gcggcctgga gctgaccaaccgcgaactca ccggcatcct cctggactcg 780 ctcggcgccg actggtcctc 
ggtccggaaggtcgccgacc gcaagggcca cgacctgcgc 840 tactccctcg acggcggcga gatcgagcgcgagctcggct accgcccgca ggtctccttc 900 gcggacggcc tcgcgcggac cgtccgctggtaccgggaga accgcggctg gtgggagccg 960 ctcaaggcga ccgccccgca gctgcccgccaccgccgtgg aggtgtccgc gtga 1014 14 337 PRT Streptomyces venezuelae 14Met Arg Leu Leu Val Thr Gly Gly Ala Gly Phe Ile Gly Ser His Phe 1 5 1015 Val Arg Gln Leu Leu Ala Gly Ala Tyr Pro Asp Val Pro Ala Asp Glu 20 2530 Val Ile Val Leu Asp Ser Leu Thr Tyr Ala Gly Asn Arg Ala Asn Leu 35 4045 Ala Pro Val Asp Ala Asp Pro Arg Leu Arg Phe Val His Gly Asp Ile 50 5560 Arg Asp Ala Gly Leu Leu Ala Arg Glu Leu Arg Gly Val Asp Ala Ile 65 7075 80 Val His Phe Ala Ala Glu Ser His Val Asp Arg Ser Ile Ala Gly Ala 8590 95 Ser Val Phe Thr Glu Thr Asn Val Gln Gly Thr Gln Thr Leu Leu Gln100 105 110 Cys Ala Val Asp Ala Gly Val Gly Arg Val Val His Val Ser ThrAsp 115 120 125 Glu Val Tyr Gly Ser Ile Asp Ser Gly Ser Trp Thr Glu SerSer Pro 130 135 140 Leu Glu Pro Asn Ser Pro Tyr Ala Ala Ser Lys Ala GlySer Asp Leu 145 150 155 160 Val Ala Arg Ala Tyr His Arg Thr Tyr Gly LeuAsp Val Arg Ile Thr 165 170 175 Arg Cys Cys Asn Asn Tyr Gly Pro Tyr GlnHis Pro Glu Lys Leu Ile 180 185 190 Pro Leu Phe Val Thr Asn Leu Leu AspGly Gly Thr Leu Pro Leu Tyr 195 200 205 Gly Asp Gly Ala Asn Val Arg GluTrp Val His Thr Asp Asp His Cys 210 215 220 Arg Gly Ile Ala Leu Val LeuAla Gly Gly Arg Ala Gly Glu Ile Tyr 225 230 235 240 His Ile Gly Gly GlyLeu Glu Leu Thr Asn Arg Glu Leu Thr Gly Ile 245 250 255 Leu Leu Asp SerLeu Gly Ala Asp Trp Ser Ser Val Arg Lys Val Ala 260 265 270 Asp Arg LysGly His Asp Leu Arg Tyr Ser Leu Asp Gly Gly Glu Ile 275 280 285 Glu ArgGlu Leu Gly Tyr Arg Pro Gln Val Ser Phe Ala Asp Gly Leu 290 295 300 AlaArg Thr Val Arg Trp Tyr Arg Glu Asn Arg Gly Trp Trp Glu Pro 305 310 315320 Leu Lys Ala Thr Ala Pro Gln Leu Pro Ala Thr Ala Val Glu Val Ser 325330 335 Ala 15 1140 DNA Streptomyces venezuelae 15 gtgagcagcc gcgccgagaccccccgcgtc cccttcctcg acctcaaggc cgcctacgag 60 gagctccgcg 
cggagaccgacgccgcgatc gcccgcgtcc tcgactcggg gcgctacctc 120 ctcggacccg aactcgaaggattcgaggcg gagttcgccg cgtactgcga gacggaccac 180 gccgtcggcg tgaacagcgggatggacgcc ctccagctcg ccctccgcgg cctcggcatc 240 ggacccgggg acgaggtgatcgtcccctcg cacacgtaca tcgccagctg gctcgcggtg 300 tccgccaccg gcgcgacccccgtgcccgtc gagccgcacg aggaccaccc caccctggac 360 ccgctgctcg tcgagaaggcgatcaccccc cgcacccggg cgctcctccc cgtccacctc 420 tacgggcacc ccgccgacatggacgccctc cgcgagctcg cggaccggca cggcctgcac 480 atcgtcgagg acgccgcgcaggcccacggc gcccgctacc ggggccggcg gatcggcgcc 540 gggtcgtcgg tggccgcgttcagcttctac ccgggcaaga acctcggctg cttcggcgac 600 ggcggcgccg tcgtcaccggcgaccccgag ctcgccgaac ggctccggat gctccgcaac 660 tacggctcgc ggcagaagtacagccacgag acgaagggca ccaactcccg cctggacgag 720 atgcaggccg ccgtgctgcggatccggctc gcccacctgg acagctggaa cggccgcagg 780 tcggcgctgg ccgcggagtacctctccggg ctcgccggac tgcccggcat cggcctgccg 840 gtgaccgcgc ccgacaccgacccggtctgg cacctcttca ccgtgcgcac cgagcgccgc 900 gacgagctgc gcagccacctcgacgcccgc ggcatcgaca ccctcacgca ctacccggta 960 cccgtgcacc tctcgcccgcctacgcgggc gaggcaccgc cggaaggctc gctcccgcgg 1020 gccgagagct tcgcgcggcaggtcctcagc ctgccgatcg gcccgcacct ggagcgcccg 1080 caggcgctgc gggtgatcgacgccgtgcgc gaatgggccg agcgggtcga ccaggcctag 1140 16 379 PRT Streptomycesvenezuelae 16 Met Ser Ser Arg Ala Glu Thr Pro Arg Val Pro Phe Leu AspLeu Lys 1 5 10 15 Ala Ala Tyr Glu Glu Leu Arg Ala Glu Thr Asp Ala AlaIle Ala Arg 20 25 30 Val Leu Asp Ser Gly Arg Tyr Leu Leu Gly Pro Glu LeuGlu Gly Phe 35 40 45 Glu Ala Glu Phe Ala Ala Tyr Cys Glu Thr Asp His AlaVal Gly Val 50 55 60 Asn Ser Gly Met Asp Ala Leu Gln Leu Ala Leu Arg GlyLeu Gly Ile 65 70 75 80 Gly Pro Gly Asp Glu Val Ile Val Pro Ser His ThrTyr Ile Ala Ser 85 90 95 Trp Leu Ala Val Ser Ala Thr Gly Ala Thr Pro ValPro Val Glu Pro 100 105 110 His Glu Asp His Pro Thr Leu Asp Pro Leu LeuVal Glu Lys Ala Ile 115 120 125 Thr Pro Arg Thr Arg Ala Leu Leu Pro ValHis Leu Tyr Gly His Pro 130 135 140 Ala Asp Met Asp Ala Leu Arg Glu LeuAla Asp Arg His Gly Leu His 145 
150 155 160 Ile Val Glu Asp Ala Ala GlnAla His Gly Ala Arg Tyr Arg Gly Arg 165 170 175 Arg Ile Gly Ala Gly SerSer Val Ala Ala Phe Ser Phe Tyr Pro Gly 180 185 190 Lys Asn Leu Gly CysPhe Gly Asp Gly Gly Ala Val Val Thr Gly Asp 195 200 205 Pro Glu Leu AlaGlu Arg Leu Arg Met Leu Arg Asn Tyr Gly Ser Arg 210 215 220 Gln Lys TyrSer His Glu Thr Lys Gly Thr Asn Ser Arg Leu Asp Glu 225 230 235 240 MetGln Ala Ala Val Leu Arg Ile Arg Leu Ala His Leu Asp Ser Trp 245 250 255Asn Gly Arg Arg Ser Ala Leu Ala Ala Glu Tyr Leu Ser Gly Leu Ala 260 265270 Gly Leu Pro Gly Ile Gly Leu Pro Val Thr Ala Pro Asp Thr Asp Pro 275280 285 Val Trp His Leu Phe Thr Val Arg Thr Glu Arg Arg Asp Glu Leu Arg290 295 300 Ser His Leu Asp Ala Arg Gly Ile Asp Thr Leu Thr His Tyr ProVal 305 310 315 320 Pro Val His Leu Ser Pro Ala Tyr Ala Gly Glu Ala ProPro Glu Gly 325 330 335 Ser Leu Pro Arg Ala Glu Ser Phe Ala Arg Gln ValLeu Ser Leu Pro 340 345 350 Ile Gly Pro His Leu Glu Arg Pro Gln Ala LeuArg Val Ile Asp Ala 355 360 365 Val Arg Glu Trp Ala Glu Arg Val Asp GlnAla 370 375 17 714 DNA Streptomyces venezuelae 17 gtgtacgaag tcgaccacgccgacgtctac gacctcttct acctgggtcg cggcaaggac 60 tacgccgccg aggcctccgacatcgccgac ctggtgcgct cccgtacccc cgaggcctcc 120 tcgctcctgg acgtggcctgcggtacgggc acgcatctgg agcacttcac caaggagttc 180 ggcgacaccg ccggcctggagctgtccgag gacatgctca cccacgcccg caagcggctg 240 cccgacgcca cgctccaccagggcgacatg cgggacttcc ggctcggccg gaagttctcc 300 gccgtggtca gcatgttcagctccgtcggc tacctgaaga cgaccgagga actcggcgcg 360 gccgtcgcct cgttcgcggagcacctggag cccggtggcg tcgtcgtcgt cgagccgtgg 420 tggttcccgg agaccttcgccgacggctgg gtcagcgccg acgtcgtccg ccgtgacggg 480 cgcaccgtgg cccgtgtctcgcactcggtg cgggagggga acgcgacgcg catggaggtc 540 cacttcaccg tggccgacccgggcaagggc gtgcggcact tctccgacgt ccatctcatc 600 accctgttcc accaggccgagtacgaggcc gcgttcacgg ccgccgggct gcgcgtcgag 660 tacctggagg gcggcccgtcgggccgtggc ctcttcgtcg gcgtccccgc ctga 714 18 237 PRT Streptomycesvenezuelae 18 Met Tyr Glu Val Asp His Ala Asp Val Tyr Asp Leu Phe TyrLeu 
Gly 1 5 10 15 Arg Gly Lys Asp Tyr Ala Ala Glu Ala Ser Asp Ile AlaAsp Leu Val 20 25 30 Arg Ser Arg Thr Pro Glu Ala Ser Ser Leu Leu Asp ValAla Cys Gly 35 40 45 Thr Gly Thr His Leu Glu His Phe Thr Lys Glu Phe GlyAsp Thr Ala 50 55 60 Gly Leu Glu Leu Ser Glu Asp Met Leu Thr His Ala ArgLys Arg Leu 65 70 75 80 Pro Asp Ala Thr Leu His Gln Gly Asp Met Arg AspPhe Arg Leu Gly 85 90 95 Arg Lys Phe Ser Ala Val Val Ser Met Phe Ser SerVal Gly Tyr Leu 100 105 110 Lys Thr Thr Glu Glu Leu Gly Ala Ala Val AlaSer Phe Ala Glu His 115 120 125 Leu Glu Pro Gly Gly Val Val Val Val GluPro Trp Trp Phe Pro Glu 130 135 140 Thr Phe Ala Asp Gly Trp Val Ser AlaAsp Val Val Arg Arg Asp Gly 145 150 155 160 Arg Thr Val Ala Arg Val SerHis Ser Val Arg Glu Gly Asn Ala Thr 165 170 175 Arg Met Glu Val His PheThr Val Ala Asp Pro Gly Lys Gly Val Arg 180 185 190 His Phe Ser Asp ValHis Leu Ile Thr Leu Phe His Gln Ala Glu Tyr 195 200 205 Glu Ala Ala PheThr Ala Ala Gly Leu Arg Val Glu Tyr Leu Glu Gly 210 215 220 Gly Pro SerGly Arg Gly Leu Phe Val Gly Val Pro Ala 225 230 235 19 1281 DNAStreptomyces venezuelae 19 atgcgcgtcc tgctgacctc gttcgcacat cacacgcactactacggcct ggtgcccctg 60 gcctgggcgc tgctcgccgc cgggcacgag gtgcgggtcgccagccagcc cgcgctcacg 120 gacaccatca ccgggtccgg gctcgccgcg gtgccggtcggcaccgacca cctcatccac 180 gagtaccggg tgcggatggc gggcgagccg cgcccgaaccatccggcgat cgccttcgac 240 gaggcccgtc ccgagccgct ggactgggac cacgccctcggcatcgaggc gatcctcgcc 300 ccgtacttcc atctgctcgc caacaacgac tcgatggtcgacgacctcgt cgacttcgcc 360 cggtcctggc agccggacct ggtgctgtgg gagccgacgacctacgcggg cgccgtcgcc 420 gcccaggtca ccggtgccgc gcacgcccgg gtcctgtgggggcccgacgt gatgggcagc 480 gcccgccgca agttcgtcgc gctgcgggac cggcagccgcccgagcaccg cgaggacccc 540 accgcggagt ggctgacgtg gacgctcgac cggtacggcgcctccttcga agaggagctg 600 ctcaccggcc agttcacgat cgacccgacc ccgccgagcctgcgcctcga cacgggcctg 660 ccgaccgtcg ggatgcgtta tgttccgtac aacggcacgtcggtcgtgcc ggactggctg 720 agtgagccgc ccgcgcggcc ccgggtctgc ctgaccctcggcgtctccgc gcgtgaggtc 780 ctcggcggcg acggcgtctc 
gcagggcgac atcctggaggcgctcgccga cctcgacatc 840 gagctcgtcg ccacgctcga cgcgagtcag cgcgccgagatccgcaacta cccgaagcac 900 acccggttca cggacttcgt gccgatgcac gcgctcctgccgagctgctc ggcgatcatc 960 caccacggcg gggcgggcac ctacgcgacc gccgtgatcaacgcggtgcc gcaggtcatg 1020 ctcgccgagc tgtgggacgc gccggtcaag gcgcgggccgtcgccgagca gggggcgggg 1080 ttcttcctgc cgccggccga gctcacgccg caggccgtgcgggacgccgt cgtccgcatc 1140 ctcgacgacc cctcggtcgc caccgccgcg caccggctgcgcgaggagac cttcggcgac 1200 cccaccccgg ccgggatcgt ccccgagctg gagcggctcgccgcgcagca ccgccgcccg 1260 ccggccgacg cccggcactg a 1281 20 426 PRTStreptomyces venezuelae 20 Met Arg Val Leu Leu Thr Ser Phe Ala His HisThr His Tyr Tyr Gly 1 5 10 15 Leu Val Pro Leu Ala Trp Ala Leu Leu AlaAla Gly His Glu Val Arg 20 25 30 Val Ala Ser Gln Pro Ala Leu Thr Asp ThrIle Thr Gly Ser Gly Leu 35 40 45 Ala Ala Val Pro Val Gly Thr Asp His LeuIle His Glu Tyr Arg Val 50 55 60 Arg Met Ala Gly Glu Pro Arg Pro Asn HisPro Ala Ile Ala Phe Asp 65 70 75 80 Glu Ala Arg Pro Glu Pro Leu Asp TrpAsp His Ala Leu Gly Ile Glu 85 90 95 Ala Ile Leu Ala Pro Tyr Phe His LeuLeu Ala Asn Asn Asp Ser Met 100 105 110 Val Asp Asp Leu Val Asp Phe AlaArg Ser Trp Gln Pro Asp Leu Val 115 120 125 Leu Trp Glu Pro Thr Thr TyrAla Gly Ala Val Ala Ala Gln Val Thr 130 135 140 Gly Ala Ala His Ala ArgVal Leu Trp Gly Pro Asp Val Met Gly Ser 145 150 155 160 Ala Arg Arg LysPhe Val Ala Leu Arg Asp Arg Gln Pro Pro Glu His 165 170 175 Arg Glu AspPro Thr Ala Glu Trp Leu Thr Trp Thr Leu Asp Arg Tyr 180 185 190 Gly AlaSer Phe Glu Glu Glu Leu Leu Thr Gly Gln Phe Thr Ile Asp 195 200 205 ProThr Pro Pro Ser Leu Arg Leu Asp Thr Gly Leu Pro Thr Val Gly 210 215 220Met Arg Tyr Val Pro Tyr Asn Gly Thr Ser Val Val Pro Asp Trp Leu 225 230235 240 Ser Glu Pro Pro Ala Arg Pro Arg Val Cys Leu Thr Leu Gly Val Ser245 250 255 Ala Arg Glu Val Leu Gly Gly Asp Gly Val Ser Gln Gly Asp IleLeu 260 265 270 Glu Ala Leu Ala Asp Leu Asp Ile Glu Leu Val Ala Thr LeuAsp Ala 275 280 285 Ser Gln Arg Ala Glu Ile Arg Asn Tyr Pro Lys His ThrArg 
Phe Thr 290 295 300 Asp Phe Val Pro Met His Ala Leu Leu Pro Ser CysSer Ala Ile Ile 305 310 315 320 His His Gly Gly Ala Gly Thr Tyr Ala ThrAla Val Ile Asn Ala Val 325 330 335 Pro Gln Val Met Leu Ala Glu Leu TrpAsp Ala Pro Val Lys Ala Arg 340 345 350 Ala Val Ala Glu Gln Gly Ala GlyPhe Phe Leu Pro Pro Ala Glu Leu 355 360 365 Thr Pro Gln Ala Val Arg AspAla Val Val Arg Ile Leu Asp Asp Pro 370 375 380 Ser Val Ala Thr Ala AlaHis Arg Leu Arg Glu Glu Thr Phe Gly Asp 385 390 395 400 Pro Thr Pro AlaGly Ile Val Pro Glu Leu Glu Arg Leu Ala Ala Gln 405 410 415 His Arg ArgPro Pro Ala Asp Ala Arg His 420 425 21 1209 DNA Streptomyces venezuelae21 gtgaccgacg acctgacggg ggccctcacg cagcccccgc tgggccgcac cgtccgcgcg 60gtggccgacc gtgaactcgg cacccacctc ctggagaccc gcggcatcca ctggatccac 120gccgcgaacg gcgacccgta cgccaccgtg ctgcgcggcc aggcggacga cccgtatccc 180gcgtacgagc gggtgcgtgc ccgcggcgcg ctctccttca gcccgacggg cagctgggtc 240accgccgatc acgccctggc ggcgagcatc ctctgctcga cggacttcgg ggtctccggc 300gccgacggcg tcccggtgcc gcagcaggtc ctctcgtacg gggagggctg tccgctggag 360cgcgagcagg tgctgccggc ggccggtgac gtgccggagg gcgggcagcg tgccgtggtc 420gaggggatcc accgggagac gctggagggt ctcgcgccgg acccgtcggc gtcgtacgcc 480ttcgagctgc tgggcggttt cgtccgcccg gcggtgacgg ccgctgccgc cgccgtgctg 540ggtgttcccg cggaccggcg cgcggacttc gcggatctgc tggagcggct ccggccgctg 600tccgacagcc tgctggcccc gcagtccctg cggacggtac gggcggcgga cggcgcgctg 660gccgagctca cggcgctgct cgccgattcg gacgactccc ccggggccct gctgtcggcg 720ctcggggtca ccgcagccgt ccagctcacc gggaacgcgg tgctcgcgct cctcgcgcat 780cccgagcagt ggcgggagct gtgcgaccgg cccgggctcg cggcggccgc ggtggaggag 840accctccgct acgacccgcc ggtgcagctc gacgcccggg tggtccgcgg ggagacggag 900ctggcgggcc ggcggctgcc ggccggggcg catgtcgtcg tcctgaccgc cgcgaccggc 960cgggacccgg aggtcttcac ggacccggag cgcttcgacc tcgcgcgccc cgacgccgcc 1020gcgcacctcg cgctgcaccc cgccggtccg tacggcccgg tggcgtccct ggtccggctt 1080caggcggagg tcgcgctgcg gaccctggcc gggcgtttcc ccgggctgcg gcaggcgggg 1140gacgtgctcc gcccccgccg cgcgcctgtc ggccgcgggc 
cgctgagcgt cccggtcagc 1200agctcctga 1209 22 402 PRT Streptomyces venezuelae 22 Met Thr Asp Asp LeuThr Gly Ala Leu Thr Gln Pro Pro Leu Gly Arg 1 5 10 15 Thr Val Arg AlaVal Ala Asp Arg Glu Leu Gly Thr His Leu Leu Glu 20 25 30 Thr Arg Gly IleHis Trp Ile His Ala Ala Asn Gly Asp Pro Tyr Ala 35 40 45 Thr Val Leu ArgGly Gln Ala Asp Asp Pro Tyr Pro Ala Tyr Glu Arg 50 55 60 Val Arg Ala ArgGly Ala Leu Ser Phe Ser Pro Thr Gly Ser Trp Val 65 70 75 80 Thr Ala AspHis Ala Leu Ala Ala Ser Ile Leu Cys Ser Thr Asp Phe 85 90 95 Gly Val SerGly Ala Asp Gly Val Pro Val Pro Gln Gln Val Leu Ser 100 105 110 Tyr GlyGlu Gly Cys Pro Leu Glu Arg Glu Gln Val Leu Pro Ala Ala 115 120 125 GlyAsp Val Pro Glu Gly Gly Gln Arg Ala Val Val Glu Gly Ile His 130 135 140Arg Glu Thr Leu Glu Gly Leu Ala Pro Asp Pro Ser Ala Ser Tyr Ala 145 150155 160 Phe Glu Leu Leu Gly Gly Phe Val Arg Pro Ala Val Thr Ala Ala Ala165 170 175 Ala Ala Val Leu Gly Val Pro Ala Asp Arg Arg Ala Asp Phe AlaAsp 180 185 190 Leu Leu Glu Arg Leu Arg Pro Leu Ser Asp Ser Leu Leu AlaPro Gln 195 200 205 Ser Leu Arg Thr Val Arg Ala Ala Asp Gly Ala Leu AlaGlu Leu Thr 210 215 220 Ala Leu Leu Ala Asp Ser Asp Asp Ser Pro Gly AlaLeu Leu Ser Ala 225 230 235 240 Leu Gly Val Thr Ala Ala Val Gln Leu ThrGly Asn Ala Val Leu Ala 245 250 255 Leu Leu Ala His Pro Glu Gln Trp ArgGlu Leu Cys Asp Arg Pro Gly 260 265 270 Leu Ala Ala Ala Ala Val Glu GluThr Leu Arg Tyr Asp Pro Pro Val 275 280 285 Gln Leu Asp Ala Arg Val ValArg Gly Glu Thr Glu Leu Ala Gly Arg 290 295 300 Arg Leu Pro Ala Gly AlaHis Val Val Val Leu Thr Ala Ala Thr Gly 305 310 315 320 Arg Asp Pro GluVal Phe Thr Asp Pro Glu Arg Phe Asp Leu Ala Arg 325 330 335 Pro Asp AlaAla Ala His Leu Ala Leu His Pro Ala Gly Pro Tyr Gly 340 345 350 Pro ValAla Ser Leu Val Arg Leu Gln Ala Glu Val Ala Leu Arg Thr 355 360 365 LeuAla Gly Arg Phe Pro Gly Leu Arg Gln Ala Gly Asp Val Leu Arg 370 375 380Pro Arg Arg Ala Pro Val Gly Arg Gly Pro Leu Ser Val Pro Val Ser 385 390395 400 Ser Ser 23 2430 DNA Streptomyces venezuelae 
23 gtgacaggtaagacccgaat accgcgtgtc cgccgcggcc gcaccacgcc cagggccttc 60 accctggccgtcgtcggcac cctgctggcg ggcaccaccg tggcggccgc cgctcccggc 120 gccgccgacacggccaatgt tcagtacacg agccgggcgg cggagctcgt cgcccagatg 180 acgctcgacgagaagatcag cttcgtccac tgggcgctgg accccgaccg gcagaacgtc 240 ggctaccttcccggcgtgcc gcgtctgggc atcccggagc tgcgtgccgc cgacggcccg 300 aacggcatccgcctggtggg gcagaccgcc accgcgctgc ccgcgccggt cgccctggcc 360 agcaccttcgacgacaccat ggccgacagc tacggcaagg tcatgggccg cgacggtcgc 420 gcgctcaaccaggacatggt cctgggcccg atgatgaaca acatccgggt gccgcacggc 480 ggccggaactacgagacctt cagcgaggac cccctggtct cctcgcgcac cgcggtcgcc 540 cagatcaagggcatccaggg tgcgggtctg atgaccacgg ccaagcactt cgcggccaac 600 aaccaggagaacaaccgctt ctccgtgaac gccaatgtcg acgagcagac gctccgcgag 660 atcgagttcccggcgttcga ggcgtcctcc aaggccggcg cggcctcctt catgtgtgcc 720 tacaacggcctcaacgggaa gccgtcctgc ggcaacgacg agctcctcaa caacgtgctg 780 cgcacgcagtggggcttcca gggctgggtg atgtccgact ggctcgccac cccgggcacc 840 gacgccatcaccaagggcct cgaccaggag atgggcgtcg agctccccgg cgacgtcccg 900 aagggcgagccctcgccgcc ggccaagttc ttcggcgagg cgctgaagac ggccgtcctg 960 aacggcacggtccccgaggc ggccgtgacg cggtcggcgg agcggatcgt cggccagatg 1020 gagaagttcggtctgctcct cgccactccg gcgccgcggc ccgagcgcga caaggcgggt 1080 gcccaggcggtgtcccgcaa ggtcgccgag aacggcgcgg tgctcctgcg caacgagggc 1140 caggccctgccgctcgccgg tgacgccggc aagagcatcg cggtcatcgg cccgacggcc 1200 gtcgaccccaaggtcaccgg cctgggcagc gcccacgtcg tcccggactc ggcggcggcg 1260 ccactcgacaccatcaaggc ccgcgcgggt gcgggtgcga cggtgacgta cgagacgggt 1320 gaggagaccttcgggacgca gatcccggcg gggaacctca gcccggcgtt caaccagggc 1380 caccagctcgagccgggcaa ggcgggggcg ctgtacgacg gcacgctgac cgtgcccgcc 1440 gacggcgagtaccgcatcgc ggtccgtgcc accggtggtt acgccacggt gcagctcggc 1500 agccacaccatcgaggccgg tcaggtctac ggcaaggtga gcagcccgct cctcaagctg 1560 accaagggcacgcacaagct cacgatctcg ggcttcgcga tgagtgccac cccgctctcc 1620 ctggagctgggctgggtgac gccggcggcg gccgacgcga cgatcgcgaa ggccgtggag 1680 tcggcgcggaaggcccgtac ggcggtcgtc ttcgcctacg acgacggcac 
cgagggcgtc 1740 gaccgtccgaacctgtcgct gccgggtacg caggacaagc tgatctcggc tgtcgcggac 1800 gccaacccgaacacgatcgt ggtcctcaac accggttcgt cggtgctgat gccgtggctg 1860 tccaagacccgcgcggtcct ggacatgtgg tacccgggcc aggcgggcgc cgaggccacc 1920 gccgcgctgctctacggtga cgtcaacccg agcggcaagc tcacgcagag cttcccggcc 1980 gccgagaaccagcacgcggt cgccggcgac ccgacaagct acccgggcgt cgacaaccag 2040 cagacgtaccgcgagggcat ccacgtcggg taccgctggt tcgacaagga gaacgtcaag 2100 ccgctgttcccgttcgggca cggcctgtcg tacacctcgt tcacgcagag cgccccgacc 2160 gtcgtgcgtacgtccacggg tggtctgaag gtcacggtca cggtccgcaa cagcgggaag 2220 cgcgccggccaggaggtcgt ccaggcgtac ctcggtgcca gcccgaacgt gacggctccg 2280 caggcgaagaagaagctcgt gggctacacg aaggtctcgc tcgccgcggg cgaggcgaag 2340 acggtgacggtgaacgtcga ccgccgtcag ctgcagaccg gttcgtcctc cgccgacctg 2400 cggggcagcgccacggtcaa cgtctggtga 2430 24 809 PRT Streptomyces venezuelae 24 Met ThrGly Lys Thr Arg Ile Pro Arg Val Arg Arg Gly Arg Thr Thr 1 5 10 15 ProArg Ala Phe Thr Leu Ala Val Val Gly Thr Leu Leu Ala Gly Thr 20 25 30 ThrVal Ala Ala Ala Ala Pro Gly Ala Ala Asp Thr Ala Asn Val Gln 35 40 45 TyrThr Ser Arg Ala Ala Glu Leu Val Ala Gln Met Thr Leu Asp Glu 50 55 60 LysIle Ser Phe Val His Trp Ala Leu Asp Pro Asp Arg Gln Asn Val 65 70 75 80Gly Tyr Leu Pro Gly Val Pro Arg Leu Gly Ile Pro Glu Leu Arg Ala 85 90 95Ala Asp Gly Pro Asn Gly Ile Arg Leu Val Gly Gln Thr Ala Thr Ala 100 105110 Leu Pro Ala Pro Val Ala Leu Ala Ser Thr Phe Asp Asp Thr Met Ala 115120 125 Asp Ser Tyr Gly Lys Val Met Gly Arg Asp Gly Arg Ala Leu Asn Gln130 135 140 Asp Met Val Leu Gly Pro Met Met Asn Asn Ile Arg Val Pro HisGly 145 150 155 160 Gly Arg Asn Tyr Glu Thr Phe Ser Glu Asp Pro Leu ValSer Ser Arg 165 170 175 Thr Ala Val Ala Gln Ile Lys Gly Ile Gln Gly AlaGly Leu Met Thr 180 185 190 Thr Ala Lys His Phe Ala Ala Asn Asn Gln GluAsn Asn Arg Phe Ser 195 200 205 Val Asn Ala Asn Val Asp Glu Gln Thr LeuArg Glu Ile Glu Phe Pro 210 215 220 Ala Phe Glu Ala Ser Ser Lys Ala GlyAla Ala Ser Phe Met Cys Ala 225 230 235 240 Tyr Asn Gly Leu Asn 
Gly LysPro Ser Cys Gly Asn Asp Glu Leu Leu 245 250 255 Asn Asn Val Leu Arg ThrGln Trp Gly Phe Gln Gly Trp Val Met Ser 260 265 270 Asp Trp Leu Ala ThrPro Gly Thr Asp Ala Ile Thr Lys Gly Leu Asp 275 280 285 Gln Glu Met GlyVal Glu Leu Pro Gly Asp Val Pro Lys Gly Glu Pro 290 295 300 Ser Pro ProAla Lys Phe Phe Gly Glu Ala Leu Lys Thr Ala Val Leu 305 310 315 320 AsnGly Thr Val Pro Glu Ala Ala Val Thr Arg Ser Ala Glu Arg Ile 325 330 335Val Gly Gln Met Glu Lys Phe Gly Leu Leu Leu Ala Thr Pro Ala Pro 340 345350 Arg Pro Glu Arg Asp Lys Ala Gly Ala Gln Ala Val Ser Arg Lys Val 355360 365 Ala Glu Asn Gly Ala Val Leu Leu Arg Asn Glu Gly Gln Ala Leu Pro370 375 380 Leu Ala Gly Asp Ala Gly Lys Ser Ile Ala Val Ile Gly Pro ThrAla 385 390 395 400 Val Asp Pro Lys Val Thr Gly Leu Gly Ser Ala His ValVal Pro Asp 405 410 415 Ser Ala Ala Ala Pro Leu Asp Thr Ile Lys Ala ArgAla Gly Ala Gly 420 425 430 Ala Thr Val Thr Tyr Glu Thr Gly Glu Glu ThrPhe Gly Thr Gln Ile 435 440 445 Pro Ala Gly Asn Leu Ser Pro Ala Phe AsnGln Gly His Gln Leu Glu 450 455 460 Pro Gly Lys Ala Gly Ala Leu Tyr AspGly Thr Leu Thr Val Pro Ala 465 470 475 480 Asp Gly Glu Tyr Arg Ile AlaVal Arg Ala Thr Gly Gly Tyr Ala Thr 485 490 495 Val Gln Leu Gly Ser HisThr Ile Glu Ala Gly Gln Val Tyr Gly Lys 500 505 510 Val Ser Ser Pro LeuLeu Lys Leu Thr Lys Gly Thr His Lys Leu Thr 515 520 525 Ile Ser Gly PheAla Met Ser Ala Thr Pro Leu Ser Leu Glu Leu Gly 530 535 540 Trp Val ThrPro Ala Ala Ala Asp Ala Thr Ile Ala Lys Ala Val Glu 545 550 555 560 SerAla Arg Lys Ala Arg Thr Ala Val Val Phe Ala Tyr Asp Asp Gly 565 570 575Thr Glu Gly Val Asp Arg Pro Asn Leu Ser Leu Pro Gly Thr Gln Asp 580 585590 Lys Leu Ile Ser Ala Val Ala Asp Ala Asn Pro Asn Thr Ile Val Val 595600 605 Leu Asn Thr Gly Ser Ser Val Leu Met Pro Trp Leu Ser Lys Thr Arg610 615 620 Ala Val Leu Asp Met Trp Tyr Pro Gly Gln Ala Gly Ala Glu AlaThr 625 630 635 640 Ala Ala Leu Leu Tyr Gly Asp Val Asn Pro Ser Gly LysLeu Thr Gln 645 650 655 Ser Phe Pro Ala Ala Glu Asn Gln His Ala Val AlaGly 
Asp Pro Thr 660 665 670 Ser Tyr Pro Gly Val Asp Asn Gln Gln Thr TyrArg Glu Gly Ile His 675 680 685 Val Gly Tyr Arg Trp Phe Asp Lys Glu AsnVal Lys Pro Leu Phe Pro 690 695 700 Phe Gly His Gly Leu Ser Tyr Thr SerPhe Thr Gln Ser Ala Pro Thr 705 710 715 720 Val Val Arg Thr Ser Thr GlyGly Leu Lys Val Thr Val Thr Val Arg 725 730 735 Asn Ser Gly Lys Arg AlaGly Gln Glu Val Val Gln Ala Tyr Leu Gly 740 745 750 Ala Ser Pro Asn ValThr Ala Pro Gln Ala Lys Lys Lys Leu Val Gly 755 760 765 Tyr Thr Lys ValSer Leu Ala Ala Gly Glu Ala Lys Thr Val Thr Val 770 775 780 Asn Val AspArg Arg Gln Leu Gln Thr Gly Ser Ser Ser Ala Asp Leu 785 790 795 800 ArgGly Ser Ala Thr Val Asn Val Trp 805 25 9 PRT Artificial Sequence Aconsensus sequence. 25 Leu Leu Asp Val Ala Cys Gly Thr Gly 1 5 26 1011DNA Streptomyces venezuelae 26 atggcaatgc gcgactccat accgaggcgagcggaccgcg acacccttcg ccgcgaatta 60 ggccagaact tccttcagga cgacagagccgtgcgcaatc tcgtcacgca tgtcgagggg 120 gacggtagga acgttctcga aatcggccccggaaagggcg cgataaccga ggagttggtg 180 cgctccttcg acaccgtgac ggtcgtggagatggacccgc actgggccgc gcatgtgcgg 240 cggaaattcg aaggggagag ggtcaccgtattccagggtg atttcctcga cttccgcatt 300 ccgcgcgata tcgacaccgt cgtcggaaacgttcccttcg gcatcacgac ccagattctc 360 cggagtctcc tggaatcgac gaactggcagtcggcggccc tgatagtgca gtgggaggtc 420 gcccgcaaac gcgccggtcg cagcggcggatcgctcctca cgacctcctg ggccccctgg 480 tacgagttcg cggtccacga ccgcgtccgcgcctcgtcgt tccgtccgat gccccgcgtc 540 gacggcggcg tcctgacgat caggcgacgcccccagcccc tgctgcccga gagcgcgagc 600 cgcgccttcc agaacttcgc cgaagccgtcttcaccggcc ccggacgggg cctcgcggag 660 atcctccggc gccacatccc caagcggacctaccgttccc tcgccgaccg ccacggaatt 720 ccggacggcg gactgccgaa ggacctcacgctcacccaat ggatcgccct tttccaggcc 780 tcccagccga gttacgcgcc gggggcgcccggcacgcgca tgccgggcca gggcggtggc 840 gccggcggca gggactatga ctcggagacgagcagggccg ccgtgcccgg gagccgcaga 900 tacggcccca cgcgcggcgg cgaaccctgcgcaccccgcg cacaggtccg gcagaccaag 960 ggccgccagg gcgcgcgagg ctcgtcgtacggacgccgca cgggccgtta g 1011 27 336 PRT Streptomyces 
venezuelae 27 MetAla Met Arg Asp Ser Ile Pro Arg Arg Ala Asp Arg Asp Thr Leu 1 5 10 15Arg Arg Glu Leu Gly Gln Asn Phe Leu Gln Asp Asp Arg Ala Val Arg 20 25 30Asn Leu Val Thr His Val Glu Gly Asp Gly Arg Asn Val Leu Glu Ile 35 40 45Gly Pro Gly Lys Gly Ala Ile Thr Glu Glu Leu Val Arg Ser Phe Asp 50 55 60Thr Val Thr Val Val Glu Met Asp Pro His Trp Ala Ala His Val Arg 65 70 7580 Arg Lys Phe Glu Gly Glu Arg Val Thr Val Phe Gln Gly Asp Phe Leu 85 9095 Asp Phe Arg Ile Pro Arg Asp Ile Asp Thr Val Val Gly Asn Val Pro 100105 110 Phe Gly Ile Thr Thr Gln Ile Leu Arg Ser Leu Leu Glu Ser Thr Asn115 120 125 Trp Gln Ser Ala Ala Leu Ile Val Gln Trp Glu Val Ala Arg LysArg 130 135 140 Ala Gly Arg Ser Gly Gly Ser Leu Leu Thr Thr Ser Trp AlaPro Trp 145 150 155 160 Tyr Glu Phe Ala Val His Asp Arg Val Arg Ala SerSer Phe Arg Pro 165 170 175 Met Pro Arg Val Asp Gly Gly Val Leu Thr IleArg Arg Arg Pro Gln 180 185 190 Pro Leu Leu Pro Glu Ser Ala Ser Arg AlaPhe Gln Asn Phe Ala Glu 195 200 205 Ala Val Phe Thr Gly Pro Gly Arg GlyLeu Ala Glu Ile Leu Arg Arg 210 215 220 His Ile Pro Lys Arg Thr Tyr ArgSer Leu Ala Asp Arg His Gly Ile 225 230 235 240 Pro Asp Gly Gly Leu ProLys Asp Leu Thr Leu Thr Gln Trp Ile Ala 245 250 255 Leu Phe Gln Ala SerGln Pro Ser Tyr Ala Pro Gly Ala Pro Gly Thr 260 265 270 Arg Met Pro GlyGln Gly Gly Gly Ala Gly Gly Arg Asp Tyr Asp Ser 275 280 285 Glu Thr SerArg Ala Ala Val Pro Gly Ser Arg Arg Tyr Gly Pro Thr 290 295 300 Arg GlyGly Glu Pro Cys Ala Pro Arg Ala Gln Val Arg Gln Thr Lys 305 310 315 320Gly Arg Gln Gly Ala Arg Gly Ser Ser Tyr Gly Arg Arg Thr Gly Arg 325 330335 28 28 000 29 29 000 30 13842 DNA Streptomyces venezuelae 30tgtcttcag ccggaattac caggaccggt gcgagaacac cggtgacagg gcgtggggcg 60cagcgtggg acacggggga agtgcgggtc cgacgggggt tgccccctgc cggccccgat 120atgcggagc actccttctc tcgtgctcct accggtgatg tgcgcgccga attgattcgt 180gagagatgt cgacagtgtc caagagtgag tccgaggaat tcgtgtccgt gtcgaacgac 240ccggttccg cgcacggcac agcggaaccc gtcgccgtcg tcggcatctc ctgccgggtg 300ccggcgccc 
gggacccgag agagttctgg gaactcctgg cggcaggcgg ccaggccgtc 360ccgacgtcc ccgcggaccg ctggaacgcc ggcgacttct acgacccgga ccgctccgcc 420ccggccgct cgaacagccg gtggggcggg ttcatcgagg acgtcgaccg gttcgacgcc 480ccttcttcg gcatctcgcc ccgcgaggcc gcggagatgg acccgcagca gcggctcgcc 540tggagctgg gctgggaggc cctggagcgc gccgggatcg acccgtcctc gctcaccggc 600cccgcaccg gcgtcttcgc cggcgccatc tgggacgact acgccaccct gaagcaccgc 660agggcggcg ccgcgatcac cccgcacacc gtcaccggcc tccaccgcgg catcatcgcg 720accgactct cgtacacgct cgggctccgc ggccccagca tggtcgtcga ctccggccag 780cctcgtcgc tcgtcgccgt ccacctcgcg tgcgagagcc tgcggcgcgg cgagtccgag 840tcgccctcg ccggcggcgt ctcgctcaac ctggtgccgg acagcatcat cggggcgagc 900agttcggcg gcctctcccc cgacggccgc gcctacacct tcgacgcgcg cgccaacggc 960acgtacgcg gcgagggcgg cggtttcgtc gtcctgaagc gcctctcccg ggccgtcgcc 1020acggcgacc cggtgctcgc cgtgatccgg ggcagcgccg tcaacaacgg cggcgccgcc 1080agggcatga cgacccccga cgcgcaggcg caggaggccg tgctccgcga ggcccacgag 1140gggccggga ccgcgccggc cgacgtgcgg tacgtcgagc tgcacggcac cggcaccccc 1200tgggcgacc cgatcgaggc cgctgcgctc ggcgccgccc tcggcaccgg ccgcccggcc 1260gacagccgc tcctggtcgg ctcggtcaag acgaacatcg gccacctgga gggcgcggcc 1320gcatcgccg gcctcatcaa ggccgtcctg gcggtccgcg gtcgcgcgct gcccgccagc 1380tgaactacg agaccccgaa cccggcgatc ccgttcgagg aactgaacct ccgggtgaac 1440cggagtacc tgccgtggga gccggagcac gacgggcagc ggatggtcgt cggcgtgtcc 1500cgttcggca tgggcggcac gaacgcgcat gtcgtgctcg aagaggcccc cgggggttgt 1560gaggtgctt cggtcgtgga gtcgacggtc ggcgggtcgg cggtcggcgg cggtgtggtg 1620cgtgggtgg tgtcggcgaa gtccgctgcc gcgctggacg cgcagatcga gcggcttgcc 1680cgttcgcct cgcgggatcg tacggatggt gtcgacgcgg gcgctgtcga tgcgggtgct 1740tcgatgcgg gtgctgtcgc tcgcgtactg gccggcgggc gtgctcagtt cgagcaccgg 1800ccgtcgtcg tcggcagcgg gccggacgat ctggcggcag cgctggccgc gcctgagggt 1860tggtccggg gcgtggcttc cggtgtcggg cgagtggcgt tcgtgttccc cgggcagggc 1920cgcagtggg ccggcatggg tgccgaactg ctggactctt ccgcggtgtt cgcggcggcc 1980tggccgaat gcgaggccgc actctccccg tacgtcgact ggtcgctgga ggccgtcgta 2040ggcaggccc 
ccggtgcgcc cacgctggag cgggtcgatg tcgtgcagcc tgtgacgttc 2100ccgtcatgg tctcgctggc tcgcgtgtgg cagcaccacg gggtgacgcc ccaggcggtc 2160tcggccact cgcagggcga gatcgccgcc gcgtacgtcg ccggtgccct gagcctggac 2220acgccgctc gtgtcgtgac cctgcgcagc aagtccatcg ccgcccacct cgccggcaag 2280gcggcatgc tgtccctcgc gctgagcgag gacgccgtcc tggagcgact ggccgggttc 2340acgggctgt ccgtcgccgc tgtgaacggg cccaccgcca ccgtggtctc cggtgacccc 2400tacagatcg aagagcttgc tcgggcgtgt gaggccgatg gggtccgtgc gcgggtcatt 2460ccgtcgact acgcgtccca cagccggcag gtcgagatca tcgagagcga gctcgccgag 2520tcctcgccg ggctcagccc gcaggctccg cgcgtgccgt tcttctcgac actcgaaggc 2580cctggatca ccgagcccgt gctcgacggc ggctactggt accgcaacct gcgccatcgt 2640tgggcttcg ccccggccgt cgagaccctg gccaccgacg agggcttcac ccacttcgtc 2700aggtcagcg cccaccccgt cctcaccatg gccctccccg ggaccgtcac cggtctggcg 2760ccctgcgtc gcgacaacgg cggtcaggac cgcctagtcg cctccctcgc cgaagcatgg 2820ccaacggac tcgcggtcga ctggagcccg ctcctcccct ccgcgaccgg ccaccactcc 2880acctcccca cctacgcgtt ccagaccgag cgccactggc tgggcgagat cgaggcgctc 2940ccccggcgg gcgagccggc ggtgcagccc gccgtcctcc gcacggaggc ggccgagccg 3000cggagctcg accgggacga gcagctgcgc gtgatcctgg acaaggtccg ggcgcagacg 3060cccaggtgc tggggtacgc gacaggcggg cagatcgagg tcgaccggac cttccgtgag 3120ccggttgca cctccctgac cggcgtggac ctgcgcaacc ggatcaacgc cgccttcggc 3180tacggatgg cgccgtccat gatcttcgac ttccccaccc ccgaggctct cgcggagcag 3240tgctcctcg tcgtgcacgg ggaggcggcg gcgaacccgg ccggtgcgga gccggctccg 3300tggcggcgg ccggtgccgt cgacgagccg gtggcgatcg tcggcatggc ctgccgcctg 3360ccggtgggg tcgcctcgcc ggaggacctg tggcggctgg tggccggcgg cggggacgcg 3420tctcggagt tcccgcagga ccgcggctgg gacgtggagg ggctgtacca cccggatccg 3480agcaccccg gcacgtcgta cgtccgccag ggcggtttca tcgagaacgt cgccggcttc 3540acgcggcct tcttcgggat ctcgccgcgc gaggccctcg ccatggaccc gcagcagcgg 3600tcctcctcg aaacctcctg ggaggccgtc gaggacgccg ggatcgaccc gacctccctg 3660ggggacggc aggtcggcgt cttcactggg gcgatgaccc acgagtacgg gccgagcctg 3720gggacggcg gggaaggcct cgacggctac ctgctgaccg gcaacacggc cagcgtgatg 
3780cgggccgcg tctcgtacac actcggcctt gagggccccg ccctgacggt ggacacggcc 3840gctcgtcgt cgctggtcgc cctgcacctc gccgtgcagg ccctgcgcaa gggcgaggtc 3900acatggcgc tcgccggcgg cgtggccgtg atgcccacgc ccgggatgtt cgtcgagttc 3960gccggcagc gcgggctggc cggggacggc cggtcgaagg cgttcgccgc gtcggcggac 4020gcaccagct ggtccgaggg cgtcggcgtc ctcctcgtcg agcgcctgtc ggacgcccgc 4080gcaacggac accaggtcct cgcggtcgtc cgcggcagcg ccttgaacca ggacggcgcg 4140gcaacggcc tcacggctcc gaacgggccc tcgcagcagc gcgtcatccg gcgcgcgctg 4200cggacgccc ggctgacgac ctccgacgtg gacgtcgtcg aggcacacgg cacgggcacg 4260gactcggcg acccgatcga ggcgcaggcc ctgatcgcca cctacggcca gggccgtgac 4320acgaacagc cgctgcgcct cgggtcgttg aagtccaaca tcgggcacac ccaggccgcg 4380ccggcgtct ccggtgtcat caagatggtc caggcgatgc gccacggact gctgccgaag 4440cgctgcacg tcgacgagcc ctcggaccag atcgactggt cggctggcgc cgtggaactc 4500tcaccgagg ccgtcgactg gccggagaag caggacggcg ggctgcgccg ggccgccgtc 4560cctccttcg ggatcagcgg caccaatgcg catgtggtgc tcgaagaggc cccggtggtt 4620tcgagggtg cttcggtcgt cgagccgtcg gttggcgggt cggcggtcgg cggcggtgtg 4680cgccttggg tggtgtcggc gaagtccgct gccgcgctcg acgcgcagat cgagcggctt 4740ccgcattcg cctcgcggga tcgtacggat gacgccgacg ccggtgctgt cgacgcgggc 4800ctgtcgctc acgtactggc tgacgggcgt gctcagttcg agcaccgggc cgtcgcgctc 4860gcgccgggg cggacgacct cgtacaggcg ctggccgatc cggacgggct gatacgcgga 4920cggcttccg gtgtcgggcg agtggcgttc gtgttccccg gtcagggcac gcagtgggct 4980gcatgggtg ccgaactgct ggactcttcc gcggtgttcg cggcggccat ggccgagtgt 5040aggccgcgc tgtccccgta cgtcgactgg tcgctggagg ccgtcgtacg gcaggccccc 5100gtgcgccca cgctggagcg ggtcgatgtc gtgcagcctg tgacgttcgc cgtcatggtc 5160cgctggctc gcgtgtggca gcaccacggt gtgacgcccc aggcggtcgt cggccactcg 5220agggcgaga tcgccgccgc gtacgtcgcc ggagccctgc ccctggacga cgccgcccgc 5280tcgtcaccc tgcgcagcaa gtccatcgcc gcccacctcg ccggcaaggg cggcatgctg 5340ccctcgcgc tgaacgagga cgccgtcctg gagcgactga gtgacttcga cgggctgtcc 5400tcgccgccg tcaacgggcc caccgccact gtcgtgtcgg gtgaccccgt acagatcgaa 5460agcttgctc aggcgtgcaa ggcggacgga ttccgcgcgc ggatcattcc 
cgtcgactac 5520cgtcccaca gccggcaggt cgagatcatc gagagcgagc tcgcccaggt cctcgccggt 5580tcagcccgc aggccccgcg cgtgccgttc ttctcgacgc tcgaaggcac ctggatcacc 5640agcccgtcc tcgacggcac ctactggtac cgcaacctcc gtcaccgcgt cggcttcgcc 5700ccgccatcg agaccctggc cgtcgacgag ggcttcacgc acttcgtcga ggtcagcgcc 5760accccgtcc tcaccatgac cctccccgag accgtcaccg gcctcggcac cctccgtcgc 5820aacagggag gccaagagcg tctggtcacc tcgctcgccg aggcgtgggt caacgggctt 5880ccgtggcat ggacttcgct cctgcccgcc acggcctccc gccccggtct gcccacctac 5940ccttccagg ccgagcgcta ctggctcgag aacactcccg ccgccctggc caccggcgac 6000actggcgct accgcatcga ctggaagcgc ctcccggccg ccgaggggtc cgagcgcacc 6060gcctgtccg gccgctggct cgccgtcacg ccggaggacc actccgcgca ggccgccgcc 6120tgctcaccg cgctggtcga cgccggggcg aaggtcgagg tgctgacggc cggggcggac 6180acgaccgtg aggccctcgc cgcccggctc accgcactga cgaccggtga cggcttcacc 6240gcgtggtct cgctcctcga cggactcgta ccgcaggtcg cctgggtcca ggcgctcggc 6300acgccggaa tcaaggcgcc cctgtggtcc gtcacccagg gcgcggtctc cgtcggacgt 6360tcgacaccc ccgccgaccc cgaccgggcc atgctctggg gcctcggccg cgtcgtcgcc 6420ttgagcacc ccgaacgctg ggccggcctc gtcgacctcc ccgcccagcc cgatgccgcc 6480ccctcgccc acctcgtcac cgcactctcc ggcgccaccg gcgaggacca gatcgccatc 6540gcaccaccg gactccacgc ccgccgcctc gcccgcgcac ccctccacgg acgtcggccc 6600cccgcgact ggcagcccca cggcaccgtc ctcatcaccg gcggcaccgg agccctcggc 6660gccacgccg cacgctggat ggcccaccac ggagccgaac acctcctcct cgtcagccgc 6720gcggcgaac aagcccccgg agccacccaa ctcaccgccg aactcaccgc atcgggcgcc 6780gcgtcacca tcgccgcctg cgacgtcgcc gacccccacg ccatgcgcac cctcctcgac 6840ccatccccg ccgagacgcc cctcaccgcc gtcgtccaca ccgccggcgc gctcgacgac 6900gcatcgtgg acacgctgac cgccgagcag gtccggcggg cccaccgtgc gaaggccgtc 6960gcgcctcgg tgctcgacga gctgacccgg gacctcgacc tcgacgcgtt cgtgctcttc 7020cgtccgtgt cgagcactct gggcatcccc ggtcagggca actacgcccc gcacaacgcc 7080acctcgacg ccctcgcggc tcgccgccgg gccaccggcc ggtccgccgt ctcggtggcc 7140ggggaccgt gggacggtgg cggcatggcc gccggtgacg gcgtggccga gcggctgcgc 7200accacggcg tgcccggcat ggacccggaa ctcgccctgg 
ccgcactgga gtccgcgctc 7260gccgggacg agaccgcgat caccgtcgcg gacatcgact gggaccgctt ctacctcgcg 7320actcctccg gtcgcccgca gcccctcgtc gaggagctgc ccgaggtgcg gcgcatcatc 7380acgcacggg acagcgccac gtccggacag ggcgggagct ccgcccaggg cgccaacccc 7440tggccgagc ggctggccgc cgcggctccc ggcgagcgta cggagatcct cctcggtctc 7500tacgggcgc aggccgccgc cgtgctccgg atgcgttcgc cggaggacgt cgccgccgac 7560gcgccttca aggacatcgg cttcgactcg ctcgccggtg tcgagctgcg caacaggctg 7620cccgggcga ccgggctcca gctgcccgcg acgctcgtct tcgaccaccc gacgccgctg 7680ccctcgtgt cgctgctccg cagcgagttc ctcggtgacg aggagacggc ggacgcccgg 7740ggtccgcgg cgctgcccgc gactgtcggt gccggtgccg gcgccggcgc cggcaccgat 7800ccgacgacg atccgatcgc gatcgtcgcg atgagctgcc gctaccccgg tgacatccgc 7860gcccggagg acctgtggcg gatgctgtcc gagggcggcg agggcatcac gccgttcccc 7920ccgaccgcg gctgggacct cgacggcctg tacgacgccg acccggacgc gctcggcagg 7980cgtacgtcc gcgagggcgg gttcctgcac gacgcggccg agttcgacgc ggagttcttc 8040gcgtctcgc cgcgcgaggc gctggccatg gacccgcagc agcggatgct cctgacgacg 8100cctgggagg ccttcgagcg ggccggcatc gagccggcat cgctgcgcgg cagcagcacc 8160gtgtcttca tcggcctctc ctaccaggac tacgcggccc gcgtcccgaa cgccccgcgt 8220gcgtggagg gttacctgct gaccggcagc acgccgagcg tcgcgtcggg ccgtatcgcg 8280acaccttcg gtctcgaagg gcccgcgacg accgtcgaca ccgcctgctc gtcgtcgctg 8340ccgccctgc acctggcggt gcgggcgctg cgcagcggcg agtgcacgat ggcgctcgcc 8400gtggcgtgg cgatgatggc gaccccgcac atgttcgtgg agttcagccg tcagcgggcg 8460tcgccccgg acggccgcag caaggccttc tcggcggacg ccgacgggtt cggcgccgcg 8520agggcgtcg gcctgctgct cgtggagcgg ctctcggacg cgcggcgcaa cggtcacccg 8580tgctcgccg tggtccgcgg taccgccgtc aaccaggacg gcgccagcaa cgggctgacc 8640cgcccaacg gaccctcgca gcagcgggtg atccggcagg cgctcgccga cgcccggctg 8700cacccggcg acatcgacgc cgtcgagacg cacggcacgg gaacctcgct gggcgacccc 8760tcgaggccc agggcctcca ggccacgtac ggcaaggagc ggcccgcgga acggccgctc 8820ccatcggct ccgtgaagtc caacatcgga cacacccagg ccgcggccgg tgcggcgggc 8880tcatcaaga tggtcctcgc gatgcgccac ggcaccctgc cgaagaccct ccacgccgac 8940agccgagcc cgcacgtcga ctgggcgaac 
agcggcctgg ccctcgtcac cgagccgatc 9000actggccgg ccggcaccgg tccgcgccgc gccgccgtct cctccttcgg catcagcggg 9060cgaacgcgc acgtcgtgct ggagcaggcg ccggatgctg ctggtgaggt gcttggggcc 9120atgaggtgc ctgaggtgtc tgagacggta gcgatggctg ggacggctgg gacctccgag 9180tcgctgagg gctctgaggc ctccgaggcc cccgcggccc ccggcagccg tgaggcgtcc 9240tccccgggc acctgccctg ggtgctgtcc gccaaggacg agcagtcgct gcgcggccag 9300ccgccgccc tgcacgcgtg gctgtccgag cccgccgccg acctgtcgga cgcggacgga 9360cggcccgcc tgcgggacgt cgggtacacg ctcgccacga gccgtaccgc cttcgcgcac 9420gcgccgccg tgaccgccgc cgaccgggac gggttcctgg acgggctggc cacgctggcc 9480agggcggca cctcggccca cgtccacctg gacaccgccc gggacggcac caccgcgttc 9540tcttcaccg gccagggcag tcagcgcccc ggcgccggcc gtgagctgta cgaccggcac 9600ccgtcttcg cccgggcgct cgacgagatc tgcgcccacc tcgacggtca cctcgaactg 9660ccctgctcg acgtgatgtt cgcggccgag ggcagcgcgg aggccgcgct gctcgacgag 9720cgcggtaca cgcagtgcgc gctgttcgcc ctggaggtcg cgctcttccg gctcgtcgag 9780gctggggca tgcggccggc cgcactgctc ggtcactcgg tcggcgagat cgccgccgcg 9840acgtcgccg gtgtgttctc gctcgccgac gccgcccgcc tggtcgccgc gcgcggccgg 9900tcatgcagg agctgcccgc cggtggcgcg atgctcgccg tccaggccgc ggaggacgag 9960tccgcgtgt ggctggagac ggaggagcgg tacgcgggac gtctggacgt cgccgccgtc 10020acggccccg aggccgccgt cctgtccggc gacgcggacg cggcgcggga ggcggaggcg 10080actggtccg ggctcggccg caggacccgc gcgctgcggg tcagccacgc cttccactcc 10140cgcacatgg acggcatgct cgacgggttc cgcgccgtcc tggagacggt ggagttccgg 10200gcccctccc tgaccgtggt ctcgaacgtc accggcctgg ccgccggccc ggacgacctg 10260gcgaccccg agtactgggt ccggcacgtc cgcggcaccg tccgcttcct cgacggcgtc 10320gtgtcctgc gcgacctcgg cgtgcggacc tgcctggagc tgggccccga cggggtcctc 10380ccgccatgg cggccgacgg cctcgcggac acccccgcgg attccgctgc cggctccccc 10440tcggctctc ccgccggctc tcccgccgac tccgccgccg gcgcgctccg gccccggccg 10500tgctcgtgg cgctgctgcg ccgcaagcgg tcggagaccg agaccgtcgc ggacgccctc 10560gcagggcgc acgcccacgg caccggaccc gactggcacg cctggttcgc cggctccggg 10620cgcaccgcg tggacctgcc cacgtactcc ttccggcgcg accgctactg gctggacgcc 10680cggcggccg 
acaccgcggt ggacaccgcc ggcctcggtc tcggcaccgc cgaccacccg 10740tgctcggcg ccgtggtcag ccttccggac cgggacggcc tgctgctcac cggccgcctc 10800ccctgcgca cccacccgtg gctcgcggac cacgccgtcc tggggagcgt cctgctcccc 10860gcgccgcga tggtcgaact cgccgcgcac gctgcggagt ccgccggtct gcgtgacgtg 10920gggagctga ccctccttga accgctggta ctgcccgagc acggtggcgt cgagctgcgc 10980tgacggtcg gggcgccggc cggagagccc ggtggcgagt cggccgggga cggcgcacgg 11040ccgtctccc tccactcgcg gctcgccgac gcgcccgccg gtaccgcctg gtcctgccac 11100cgaccggtc tgctggccac cgaccggccc gagcttcccg tcgcgcccga ccgtgcggcc 11160tgtggccgc cgcagggcgc cgaggaggtg ccgctcgacg gtctctacga gcggctcgac 11220ggaacggcc tcgccttcgg tccgctgttc caggggctga acgcggtgtg gcggtacgag 11280gtgaggtct tcgccgacat cgcgctcccc gccaccacga atgcgaccgc gcccgcgacc 11340cgaacggcg gcgggagtgc ggcggcggcc ccctacggca tccaccccgc cctgctcgac 11400cttcgctgc acgccatcgc ggtcggcggt ctcgtcgacg agcccgagct cgtccgcgtc 11460ccttccact ggagcggtgt caccgtgcac gcggccggtg ccgcggcggc ccgggtccgt 11520tcgcctccg cggggacgga cgccgtctcg ctgtccctga cggacggcga gggacgcccg 11580tggtctccg tggaacggct cacgctgcgc ccggtcaccg ccgatcaggc ggcggcgagc 11640gcgtcggcg ggctgatgca ccgggtggcc tggcgtccgt acgccctcgc ctcgtccggc 11700aacaggacc cgcacgccac ttcgtacggg ccgaccgccg tcctcggcaa ggacgagctg 11760aggtcgccg ccgccctgga gtccgcgggc gtcgaagtcg ggctctaccc cgacctggcc 11820cgctgtccc aggacgtggc ggccggcgcc ccggcgcccc gtaccgtcct tgcgccgctg 11880ccgcgggtc ccgccgacgg cggcgcggag ggtgtacggg gcacggtggc ccggacgctg 11940agctgctcc aggcctggct ggccgacgag cacctcgcgg gcacccgcct gctcctggtc 12000cccgcggtg cggtgcggga ccccgagggg tccggcgccg acgatggcgg cgaggacctg 12060cgcacgcgg ccgcctgggg tctcgtacgg accgcgcaga ccgagaaccc cggccgcttc 12120gccttctcg acctggccga cgacgcctcg tcgtaccgga ccctgccgtc ggtgctctcc 12180acgcgggcc tgcgcgacga accgcagctc gccctgcacg acggcaccat caggctggcc 12240gcctggcct ccgtccggcc cgagaccggc accgccgcac cggcgctcgc cccggagggc 12300cggtcctgc tgaccggcgg caccggcggc ctgggcggac tggtcgcccg gcacgtggtg 12360gcgagtggg gcgtacgacg cctgctgctg gtgagccggc 
ggggcacgga cgccccgggc 12420ccgacgagc tcgtgcacga gctggaggcc ctgggagccg acgtctcggt ggccgcgtgc 12480acgtcgccg accgcgaagc cctcaccgcc gtactcgacg ccatccccgc cgaacacccg 12540tcaccgcgg tcgtccacac ggcaggcgtc ctctccgacg gcaccctccc gtccatgacg 12600cggaggacg tggaacacgt actgcggccc aaggtcgacg ccgcgttcct cctcgacgaa 12660tcacctcga cgcccgcata cgacctggca gcgttcgtca tgttctcctc cgccgccgcc 12720tcttcggtg gcgcggggca gggcgcctac gccgccgcca acgccaccct cgacgccctc 12780cctggcgcc gccgggcagc cggactcccc gccctctccc tcggctgggg cctctgggcc 12840agaccagcg gcatgaccgg cgagctcggc caggcggacc tgcgccggat gagccgcgcg 12900gcatcggcg ggatcagcga cgccgagggc atcgcgctcc tcgacgccgc cctccgcgac 12960accgccacc cggtcctgct gcccctgcgg ctcgacgccg ccgggctgcg ggacgcggcc 13020ggaacgacc cggccggaat cccggcgctc ttccgggacg tcgtcggcgc caggaccgtc 13080gggcccggc cgtccgcggc ctccgcctcg acgacagccg ggacggccgg cacgccgggg 13140cggcggacg gcgcggcgga aacggcggcg gtcacgctcg ccgaccgggc cgccaccgtg 13200acgggcccg cacggcagcg cctgctgctc gagttcgtcg tcggcgaggt cgccgaagta 13260tcggccacg cccgcggtca ccggatcgac gccgaacggg gcttcctcga cctcggcttc 13320actccctga ccgccgtcga actccgcaac cggctcaact ccgccggtgg cctcgccctc 13380cggcgaccc tggtcttcga ccacccaagc ccggcggcac tcgcctccca cctggacgcc 13440agctgccgc gcggcgcctc ggaccaggac ggagccggga accggaacgg gaacgagaac 13500ggacgacgg cgtcccggag caccgccgag acggacgcgc tgctggcaca actgacccgc 13560tggaaggcg ccttggtgct gacgggcctc tcggacgccc ccgggagcga agaagtcctg 13620agcacctgc ggtccctgcg ctcgatggtc acgggcgaga ccgggaccgg gaccgcgtcc 13680gagccccgg acggcgccgg gtccggcgcc gaggaccggc cctgggcggc cggggacgga 13740ccgggggcg ggagtgagga cggcgcggga gtgccggact tcatgaacgc ctcggccgag 13800aactcttcg gcctcctcga ccaggacccc agcacggact ga 13842 31 4613 PRTStreptomyces venezuelae 31 Met Ser Ser Ala Gly Ile Thr Arg Thr Gly AlaArg Thr Pro Val Thr 1 5 10 15 Gly Arg Gly Ala Ala Ala Trp Asp Thr GlyGlu Val Arg Val Arg Arg 20 25 30 Gly Leu Pro Pro Ala Gly Pro Asp His AlaGlu His Ser Phe Ser Arg 35 40 45 Ala Pro Thr Gly Asp Val Arg Ala Glu LeuIle Arg Gly Glu 
Met Ser 50 55 60 Thr Val Ser Lys Ser Glu Ser Glu Glu PheVal Ser Val Ser Asn Asp 65 70 75 80 Ala Gly Ser Ala His Gly Thr Ala GluPro Val Ala Val Val Gly Ile 85 90 95 Ser Cys Arg Val Pro Gly Ala Arg AspPro Arg Glu Phe Trp Glu Leu 100 105 110 Leu Ala Ala Gly Gly Gln Ala ValThr Asp Val Pro Ala Asp Arg Trp 115 120 125 Asn Ala Gly Asp Phe Tyr AspPro Asp Arg Ser Ala Pro Gly Arg Ser 130 135 140 Asn Ser Arg Trp Gly GlyPhe Ile Glu Asp Val Asp Arg Phe Asp Ala 145 150 155 160 Ala Phe Phe GlyIle Ser Pro Arg Glu Ala Ala Glu Met Asp Pro Gln 165 170 175 Gln Arg LeuAla Leu Glu Leu Gly Trp Glu Ala Leu Glu Arg Ala Gly 180 185 190 Ile AspPro Ser Ser Leu Thr Gly Thr Arg Thr Gly Val Phe Ala Gly 195 200 205 AlaIle Trp Asp Asp Tyr Ala Thr Leu Lys His Arg Gln Gly Gly Ala 210 215 220Ala Ile Thr Pro His Thr Val Thr Gly Leu His Arg Gly Ile Ile Ala 225 230235 240 Asn Arg Leu Ser Tyr Thr Leu Gly Leu Arg Gly Pro Ser Met Val Val245 250 255 Asp Ser Gly Gln Ser Ser Ser Leu Val Ala Val His Leu Ala CysGlu 260 265 270 Ser Leu Arg Arg Gly Glu Ser Glu Leu Ala Leu Ala Gly GlyVal Ser 275 280 285 Leu Asn Leu Val Pro Asp Ser Ile Ile Gly Ala Ser LysPhe Gly Gly 290 295 300 Leu Ser Pro Asp Gly Arg Ala Tyr Thr Phe Asp AlaArg Ala Asn Gly 305 310 315 320 Tyr Val Arg Gly Glu Gly Gly Gly Phe ValVal Leu Lys Arg Leu Ser 325 330 335 Arg Ala Val Ala Asp Gly Asp Pro ValLeu Ala Val Ile Arg Gly Ser 340 345 350 Ala Val Asn Asn Gly Gly Ala AlaGln Gly Met Thr Thr Pro Asp Ala 355 360 365 Gln Ala Gln Glu Ala Val LeuArg Glu Ala His Glu Arg Ala Gly Thr 370 375 380 Ala Pro Ala Asp Val ArgTyr Val Glu Leu His Gly Thr Gly Thr Pro 385 390 395 400 Val Gly Asp ProIle Glu Ala Ala Ala Leu Gly Ala Ala Leu Gly Thr 405 410 415 Gly Arg ProAla Gly Gln Pro Leu Leu Val Gly Ser Val Lys Thr Asn 420 425 430 Ile GlyHis Leu Glu Gly Ala Ala Gly Ile Ala Gly Leu Ile Lys Ala 435 440 445 ValLeu Ala Val Arg Gly Arg Ala Leu Pro Ala Ser Leu Asn Tyr Glu 450 455 460Thr Pro Asn Pro Ala Ile Pro Phe Glu Glu Leu Asn Leu Arg Val Asn 465 470475 480 Thr Glu Tyr Leu 
Pro Trp Glu Pro Glu His Asp Gly Gln Arg Met Val485 490 495 Val Gly Val Ser Ser Phe Gly Met Gly Gly Thr Asn Ala His ValVal 500 505 510 Leu Glu Glu Ala Pro Gly Gly Cys Arg Gly Ala Ser Val ValGlu Ser 515 520 525 Thr Val Gly Gly Ser Ala Val Gly Gly Gly Val Val ProTrp Val Val 530 535 540 Ser Ala Lys Ser Ala Ala Ala Leu Asp Ala Gln IleGlu Arg Leu Ala 545 550 555 560 Ala Phe Ala Ser Arg Asp Arg Thr Asp GlyVal Asp Ala Gly Ala Val 565 570 575 Asp Ala Gly Ala Val Asp Ala Gly AlaVal Ala Arg Val Leu Ala Gly 580 585 590 Gly Arg Ala Gln Phe Glu His ArgAla Val Val Val Gly Ser Gly Pro 595 600 605 Asp Asp Leu Ala Ala Ala LeuAla Ala Pro Glu Gly Leu Val Arg Gly 610 615 620 Val Ala Ser Gly Val GlyArg Val Ala Phe Val Phe Pro Gly Gln Gly 625 630 635 640 Thr Gln Trp AlaGly Met Gly Ala Glu Leu Leu Asp Ser Ser Ala Val 645 650 655 Phe Ala AlaAla Met Ala Glu Cys Glu Ala Ala Leu Ser Pro Tyr Val 660 665 670 Asp TrpSer Leu Glu Ala Val Val Arg Gln Ala Pro Gly Ala Pro Thr 675 680 685 LeuGlu Arg Val Asp Val Val Gln Pro Val Thr Phe Ala Val Met Val 690 695 700Ser Leu Ala Arg Val Trp Gln His His Gly Val Thr Pro Gln Ala Val 705 710715 720 Val Gly His Ser Gln Gly Glu Ile Ala Ala Ala Tyr Val Ala Gly Ala725 730 735 Leu Ser Leu Asp Asp Ala Ala Arg Val Val Thr Leu Arg Ser LysSer 740 745 750 Ile Ala Ala His Leu Ala Gly Lys Gly Gly Met Leu Ser LeuAla Leu 755 760 765 Ser Glu Asp Ala Val Leu Glu Arg Leu Ala Gly Phe AspGly Leu Ser 770 775 780 Val Ala Ala Val Asn Gly Pro Thr Ala Thr Val ValSer Gly Asp Pro 785 790 795 800 Val Gln Ile Glu Glu Leu Ala Arg Ala CysGlu Ala Asp Gly Val Arg 805 810 815 Ala Arg Val Ile Pro Val Asp Tyr AlaSer His Ser Arg Gln Val Glu 820 825 830 Ile Ile Glu Ser Glu Leu Ala GluVal Leu Ala Gly Leu Ser Pro Gln 835 840 845 Ala Pro Arg Val Pro Phe PheSer Thr Leu Glu Gly Ala Trp Ile Thr 850 855 860 Glu Pro Val Leu Asp GlyGly Tyr Trp Tyr Arg Asn Leu Arg His Arg 865 870 875 880 Val Gly Phe AlaPro Ala Val Glu Thr Leu Ala Thr Asp Glu Gly Phe 885 890 895 Thr His PheVal Glu Val Ser Ala His Pro Val Leu 
Thr Met Ala Leu 900 905 910 Pro GlyThr Val Thr Gly Leu Ala Thr Leu Arg Arg Asp Asn Gly Gly 915 920 925 GlnAsp Arg Leu Val Ala Ser Leu Ala Glu Ala Trp Ala Asn Gly Leu 930 935 940Ala Val Asp Trp Ser Pro Leu Leu Pro Ser Ala Thr Gly His His Ser 945 950955 960 Asp Leu Pro Thr Tyr Ala Phe Gln Thr Glu Arg His Trp Leu Gly Glu965 970 975 Ile Glu Ala Leu Ala Pro Ala Gly Glu Pro Ala Val Gln Pro AlaVal 980 985 990 Leu Arg Thr Glu Ala Ala Glu Pro Ala Glu Leu Asp Arg AspGlu Gln 995 1000 1005 Leu Arg Val Ile Leu Asp Lys Val Arg Ala Gln ThrAla Gln Val Leu 1010 1015 1020 Gly Tyr Ala Thr Gly Gly Gln Ile Glu ValAsp Arg Thr Phe Arg Glu 1025 1030 1035 1040 Ala Gly Cys Thr Ser Leu ThrGly Val Asp Leu Arg Asn Arg Ile Asn 1045 1050 1055 Ala Ala Phe Gly ValArg Met Ala Pro Ser Met Ile Phe Asp Phe Pro 1060 1065 1070 Thr Pro GluAla Leu Ala Glu Gln Leu Leu Leu Val Val His Gly Glu 1075 1080 1085 AlaAla Ala Asn Pro Ala Gly Ala Glu Pro Ala Pro Val Ala Ala Ala 1090 10951100 Gly Ala Val Asp Glu Pro Val Ala Ile Val Gly Met Ala Cys Arg Leu1105 1110 1115 1120 Pro Gly Gly Val Ala Ser Pro Glu Asp Leu Trp Arg LeuVal Ala Gly 1125 1130 1135 Gly Gly Asp Ala Ile Ser Glu Phe Pro Gln AspArg Gly Trp Asp Val 1140 1145 1150 Glu Gly Leu Tyr His Pro Asp Pro GluHis Pro Gly Thr Ser Tyr Val 1155 1160 1165 Arg Gln Gly Gly Phe Ile GluAsn Val Ala Gly Phe Asp Ala Ala Phe 1170 1175 1180 Phe Gly Ile Ser ProArg Glu Ala Leu Ala Met Asp Pro Gln Gln Arg 1185 1190 1195 1200 Leu LeuLeu Glu Thr Ser Trp Glu Ala Val Glu Asp Ala Gly Ile Asp 1205 1210 1215Pro Thr Ser Leu Arg Gly Arg Gln Val Gly Val Phe Thr Gly Ala Met 12201225 1230 Thr His Glu Tyr Gly Pro Ser Leu Arg Asp Gly Gly Glu Gly LeuAsp 1235 1240 1245 Gly Tyr Leu Leu Thr Gly Asn Thr Ala Ser Val Met SerGly Arg Val 1250 1255 1260 Ser Tyr Thr Leu Gly Leu Glu Gly Pro Ala LeuThr Val Asp Thr Ala 1265 1270 1275 1280 Cys Ser Ser Ser Leu Val Ala LeuHis Leu Ala Val Gln Ala Leu Arg 1285 1290 1295 Lys Gly Glu Val Asp MetAla Leu Ala Gly Gly Val Ala Val Met Pro 1300 1305 1310 Thr Pro Gly 
MetPhe Val Glu Phe Ser Arg Gln Arg Gly Leu Ala Gly 1315 1320 1325 Asp GlyArg Ser Lys Ala Phe Ala Ala Ser Ala Asp Gly Thr Ser Trp 1330 1335 1340Ser Glu Gly Val Gly Val Leu Leu Val Glu Arg Leu Ser Asp Ala Arg 13451350 1355 1360 Arg Asn Gly His Gln Val Leu Ala Val Val Arg Gly Ser AlaLeu Asn 1365 1370 1375 Gln Asp Gly Ala Ser Asn Gly Leu Thr Ala Pro AsnGly Pro Ser Gln 1380 1385 1390 Gln Arg Val Ile Arg Arg Ala Leu Ala AspAla Arg Leu Thr Thr Ser 1395 1400 1405 Asp Val Asp Val Val Glu Ala HisGly Thr Gly Thr Arg Leu Gly Asp 1410 1415 1420 Pro Ile Glu Ala Gln AlaLeu Ile Ala Thr Tyr Gly Gln Gly Arg Asp 1425 1430 1435 1440 Asp Glu GlnPro Leu Arg Leu Gly Ser Leu Lys Ser Asn Ile Gly His 1445 1450 1455 ThrGln Ala Ala Ala Gly Val Ser Gly Val Ile Lys Met Val Gln Ala 1460 14651470 Met Arg His Gly Leu Leu Pro Lys Thr Leu His Val Asp Glu Pro Ser1475 1480 1485 Asp Gln Ile Asp Trp Ser Ala Gly Ala Val Glu Leu Leu ThrGlu Ala 1490 1495 1500 Val Asp Trp Pro Glu Lys Gln Asp Gly Gly Leu ArgArg Ala Ala Val 1505 1510 1515 1520 Ser Ser Phe Gly Ile Ser Gly Thr AsnAla His Val Val Leu Glu Glu 1525 1530 1535 Ala Pro Val Val Val Glu GlyAla Ser Val Val Glu Pro Ser Val Gly 1540 1545 1550 Gly Ser Ala Val GlyGly Gly Val Thr Pro Trp Val Val Ser Ala Lys 1555 1560 1565 Ser Ala AlaAla Leu Asp Ala Gln Ile Glu Arg Leu Ala Ala Phe Ala 1570 1575 1580 SerArg Asp Arg Thr Asp Asp Ala Asp Ala Gly Ala Val Asp Ala Gly 1585 15901595 1600 Ala Val Ala His Val Leu Ala Asp Gly Arg Ala Gln Phe Glu HisArg 1605 1610 1615 Ala Val Ala Leu Gly Ala Gly Ala Asp Asp Leu Val GlnAla Leu Ala 1620 1625 1630 Asp Pro Asp Gly Leu Ile Arg Gly Thr Ala SerGly Val Gly Arg Val 1635 1640 1645 Ala Phe Val Phe Pro Gly Gln Gly ThrGln Trp Ala Gly Met Gly Ala 1650 1655 1660 Glu Leu Leu Asp Ser Ser AlaVal Phe Ala Ala Ala Met Ala Glu Cys 1665 1670 1675 1680 Glu Ala Ala LeuSer Pro Tyr Val Asp Trp Ser Leu Glu Ala Val Val 1685 1690 1695 Arg GlnAla Pro Gly Ala Pro Thr Leu Glu Arg Val Asp Val Val Gln 1700 1705 1710Pro Val Thr Phe Ala Val Met Val Ser Leu 
Ala Arg Val Trp Gln His 17151720 1725 His Gly Val Thr Pro Gln Ala Val Val Gly His Ser Gln Gly GluIle 1730 1735 1740 Ala Ala Ala Tyr Val Ala Gly Ala Leu Pro Leu Asp AspAla Ala Arg 1745 1750 1755 1760 Val Val Thr Leu Arg Ser Lys Ser Ile AlaAla His Leu Ala Gly Lys 1765 1770 1775 Gly Gly Met Leu Ser Leu Ala LeuAsn Glu Asp Ala Val Leu Glu Arg 1780 1785 1790 Leu Ser Asp Phe Asp GlyLeu Ser Val Ala Ala Val Asn Gly Pro Thr 1795 1800 1805 Ala Thr Val ValSer Gly Asp Pro Val Gln Ile Glu Glu Leu Ala Gln 1810 1815 1820 Ala CysLys Ala Asp Gly Phe Arg Ala Arg Ile Ile Pro Val Asp Tyr 1825 1830 18351840 Ala Ser His Ser Arg Gln Val Glu Ile Ile Glu Ser Glu Leu Ala Gln1845 1850 1855 Val Leu Ala Gly Leu Ser Pro Gln Ala Pro Arg Val Pro PhePhe Ser 1860 1865 1870 Thr Leu Glu Gly Thr Trp Ile Thr Glu Pro Val LeuAsp Gly Thr Tyr 1875 1880 1885 Trp Tyr Arg Asn Leu Arg His Arg Val GlyPhe Ala Pro Ala Ile Glu 1890 1895 1900 Thr Leu Ala Val Asp Glu Gly PheThr His Phe Val Glu Val Ser Ala 1905 1910 1915 1920 His Pro Val Leu ThrMet Thr Leu Pro Glu Thr Val Thr Gly Leu Gly 1925 1930 1935 Thr Leu ArgArg Glu Gln Gly Gly Gln Glu Arg Leu Val Thr Ser Leu 1940 1945 1950 AlaGlu Ala Trp Val Asn Gly Leu Pro Val Ala Trp Thr Ser Leu Leu 1955 19601965 Pro Ala Thr Ala Ser Arg Pro Gly Leu Pro Thr Tyr Ala Phe Gln Ala1970 1975 1980 Glu Arg Tyr Trp Leu Glu Asn Thr Pro Ala Ala Leu Ala ThrGly Asp 1985 1990 1995 2000 Asp Trp Arg Tyr Arg Ile Asp Trp Lys Arg LeuPro Ala Ala Glu Gly 2005 2010 2015 Ser Glu Arg Thr Gly Leu Ser Gly ArgTrp Leu Ala Val Thr Pro Glu 2020 2025 2030 Asp His Ser Ala Gln Ala AlaAla Val Leu Thr Ala Leu Val Asp Ala 2035 2040 2045 Gly Ala Lys Val GluVal Leu Thr Ala Gly Ala Asp Asp Asp Arg Glu 2050 2055 2060 Ala Leu AlaAla Arg Leu Thr Ala Leu Thr Thr Gly Asp Gly Phe Thr 2065 2070 2075 2080Gly Val Val Ser Leu Leu Asp Gly Leu Val Pro Gln Val Ala Trp Val 20852090 2095 Gln Ala Leu Gly Asp Ala Gly Ile Lys Ala Pro Leu Trp Ser ValThr 2100 2105 2110 Gln Gly Ala Val Ser Val Gly Arg Leu Asp Thr Pro AlaAsp Pro Asp 2115 
2120 2125 Arg Ala Met Leu Trp Gly Leu Gly Arg Val ValAla Leu Glu His Pro 2130 2135 2140 Glu Arg Trp Ala Gly Leu Val Asp LeuPro Ala Gln Pro Asp Ala Ala 2145 2150 2155 2160 Ala Leu Ala His Leu ValThr Ala Leu Ser Gly Ala Thr Gly Glu Asp 2165 2170 2175 Gln Ile Ala IleArg Thr Thr Gly Leu His Ala Arg Arg Leu Ala Arg 2180 2185 2190 Ala ProLeu His Gly Arg Arg Pro Thr Arg Asp Trp Gln Pro His Gly 2195 2200 2205Thr Val Leu Ile Thr Gly Gly Thr Gly Ala Leu Gly Ser His Ala Ala 22102215 2220 Arg Trp Met Ala His His Gly Ala Glu His Leu Leu Leu Val SerArg 2225 2230 2235 2240 Ser Gly Glu Gln Ala Pro Gly Ala Thr Gln Leu ThrAla Glu Leu Thr 2245 2250 2255 Ala Ser Gly Ala Arg Val Thr Ile Ala AlaCys Asp Val Ala Asp Pro 2260 2265 2270 His Ala Met Arg Thr Leu Leu AspAla Ile Pro Ala Glu Thr Pro Leu 2275 2280 2285 Thr Ala Val Val His ThrAla Gly Ala Leu Asp Asp Gly Ile Val Asp 2290 2295 2300 Thr Leu Thr AlaGlu Gln Val Arg Arg Ala His Arg Ala Lys Ala Val 2305 2310 2315 2320 GlyAla Ser Val Leu Asp Glu Leu Thr Arg Asp Leu Asp Leu Asp Ala 2325 23302335 Phe Val Leu Phe Ser Ser Val Ser Ser Thr Leu Gly Ile Pro Gly Gln2340 2345 2350 Gly Asn Tyr Ala Pro His Asn Ala Tyr Leu Asp Ala Leu AlaAla Arg 2355 2360 2365 Arg Arg Ala Thr Gly Arg Ser Ala Val Ser Val AlaTrp Gly Pro Trp 2370 2375 2380 Asp Gly Gly Gly Met Ala Ala Gly Asp GlyVal Ala Glu Arg Leu Arg 2385 2390 2395 2400 Asn His Gly Val Pro Gly MetAsp Pro Glu Leu Ala Leu Ala Ala Leu 2405 2410 2415 Glu Ser Ala Leu GlyArg Asp Glu Thr Ala Ile Thr Val Ala Asp Ile 2420 2425 2430 Asp Trp AspArg Phe Tyr Leu Ala Tyr Ser Ser Gly Arg Pro Gln Pro 2435 2440 2445 LeuVal Glu Glu Leu Pro Glu Val Arg Arg Ile Ile Asp Ala Arg Asp 2450 24552460 Ser Ala Thr Ser Gly Gln Gly Gly Ser Ser Ala Gln Gly Ala Asn Pro2465 2470 2475 2480 Leu Ala Glu Arg Leu Ala Ala Ala Ala Pro Gly Glu ArgThr Glu Ile 2485 2490 2495 Leu Leu Gly Leu Val Arg Ala Gln Ala Ala AlaVal Leu Arg Met Arg 2500 2505 2510 Ser Pro Glu Asp Val Ala Ala Asp ArgAla Phe Lys Asp Ile Gly Phe 2515 2520 2525 Asp Ser Leu Ala 
Gly Val GluLeu Arg Asn Arg Leu Thr Arg Ala Thr 2530 2535 2540 Gly Leu Gln Leu ProAla Thr Leu Val Phe Asp His Pro Thr Pro Leu 2545 2550 2555 2560 Ala LeuVal Ser Leu Leu Arg Ser Glu Phe Leu Gly Asp Glu Glu Thr 2565 2570 2575Ala Asp Ala Arg Arg Ser Ala Ala Leu Pro Ala Thr Val Gly Ala Gly 25802585 2590 Ala Gly Ala Gly Ala Gly Thr Asp Ala Asp Asp Asp Pro Ile AlaIle 2595 2600 2605 Val Ala Met Ser Cys Arg Tyr Pro Gly Asp Ile Arg SerPro Glu Asp 2610 2615 2620 Leu Trp Arg Met Leu Ser Glu Gly Gly Glu GlyIle Thr Pro Phe Pro 2625 2630 2635 2640 Thr Asp Arg Gly Trp Asp Leu AspGly Leu Tyr Asp Ala Asp Pro Asp 2645 2650 2655 Ala Leu Gly Arg Ala TyrVal Arg Glu Gly Gly Phe Leu His Asp Ala 2660 2665 2670 Ala Glu Phe AspAla Glu Phe Phe Gly Val Ser Pro Arg Glu Ala Leu 2675 2680 2685 Ala MetAsp Pro Gln Gln Arg Met Leu Leu Thr Thr Ser Trp Glu Ala 2690 2695 2700Phe Glu Arg Ala Gly Ile Glu Pro Ala Ser Leu Arg Gly Ser Ser Thr 27052710 2715 2720 Gly Val Phe Ile Gly Leu Ser Tyr Gln Asp Tyr Ala Ala ArgVal Pro 2725 2730 2735 Asn Ala Pro Arg Gly Val Glu Gly Tyr Leu Leu ThrGly Ser Thr Pro 2740 2745 2750 Ser Val Ala Ser Gly Arg Ile Ala Tyr ThrPhe Gly Leu Glu Gly Pro 2755 2760 2765 Ala Thr Thr Val Asp Thr Ala CysSer Ser Ser Leu Thr Ala Leu His 2770 2775 2780 Leu Ala Val Arg Ala LeuArg Ser Gly Glu Cys Thr Met Ala Leu Ala 2785 2790 2795 2800 Gly Gly ValAla Met Met Ala Thr Pro His Met Phe Val Glu Phe Ser 2805 2810 2815 ArgGln Arg Ala Leu Ala Pro Asp Gly Arg Ser Lys Ala Phe Ser Ala 2820 28252830 Asp Ala Asp Gly Phe Gly Ala Ala Glu Gly Val Gly Leu Leu Leu Val2835 2840 2845 Glu Arg Leu Ser Asp Ala Arg Arg Asn Gly His Pro Val LeuAla Val 2850 2855 2860 Val Arg Gly Thr Ala Val Asn Gln Asp Gly Ala SerAsn Gly Leu Thr 2865 2870 2875 2880 Ala Pro Asn Gly Pro Ser Gln Gln ArgVal Ile Arg Gln Ala Leu Ala 2885 2890 2895 Asp Ala Arg Leu Ala Pro GlyAsp Ile Asp Ala Val Glu Thr His Gly 2900 2905 2910 Thr Gly Thr Ser LeuGly Asp Pro Ile Glu Ala Gln Gly Leu Gln Ala 2915 2920 2925 Thr Tyr GlyLys Glu Arg Pro Ala Glu Arg Pro 
Leu Ala Ile Gly Ser 2930 2935 2940 ValLys Ser Asn Ile Gly His Thr Gln Ala Ala Ala Gly Ala Ala Gly 2945 29502955 2960 Ile Ile Lys Met Val Leu Ala Met Arg His Gly Thr Leu Pro LysThr 2965 2970 2975 Leu His Ala Asp Glu Pro Ser Pro His Val Asp Trp AlaAsn Ser Gly 2980 2985 2990 Leu Ala Leu Val Thr Glu Pro Ile Asp Trp ProAla Gly Thr Gly Pro 2995 3000 3005 Arg Arg Ala Ala Val Ser Ser Phe GlyIle Ser Gly Thr Asn Ala His 3010 3015 3020 Val Val Leu Glu Gln Ala ProAsp Ala Ala Gly Glu Val Leu Gly Ala 3025 3030 3035 3040 Asp Glu Val ProGlu Val Ser Glu Thr Val Ala Met Ala Gly Thr Ala 3045 3050 3055 Gly ThrSer Glu Val Ala Glu Gly Ser Glu Ala Ser Glu Ala Pro Ala 3060 3065 3070Ala Pro Gly Ser Arg Glu Ala Ser Leu Pro Gly His Leu Pro Trp Val 30753080 3085 Leu Ser Ala Lys Asp Glu Gln Ser Leu Arg Gly Gln Ala Ala AlaLeu 3090 3095 3100 His Ala Trp Leu Ser Glu Pro Ala Ala Asp Leu Ser AspAla Asp Gly 3105 3110 3115 3120 Pro Ala Arg Leu Arg Asp Val Gly Tyr ThrLeu Ala Thr Ser Arg Thr 3125 3130 3135 Ala Phe Ala His Arg Ala Ala ValThr Ala Ala Asp Arg Asp Gly Phe 3140 3145 3150 Leu Asp Gly Leu Ala ThrLeu Ala Gln Gly Gly Thr Ser Ala His Val 3155 3160 3165 His Leu Asp ThrAla Arg Asp Gly Thr Thr Ala Phe Leu Phe Thr Gly 3170 3175 3180 Gln GlySer Gln Arg Pro Gly Ala Gly Arg Glu Leu Tyr Asp Arg His 3185 3190 31953200 Pro Val Phe Ala Arg Ala Leu Asp Glu Ile Cys Ala His Leu Asp Gly3205 3210 3215 His Leu Glu Leu Pro Leu Leu Asp Val Met Phe Ala Ala GluGly Ser 3220 3225 3230 Ala Glu Ala Ala Leu Leu Asp Glu Thr Arg Tyr ThrGln Cys Ala Leu 3235 3240 3245 Phe Ala Leu Glu Val Ala Leu Phe Arg LeuVal Glu Ser Trp Gly Met 3250 3255 3260 Arg Pro Ala Ala Leu Leu Gly HisSer Val Gly Glu Ile Ala Ala Ala 3265 3270 3275 3280 His Val Ala Gly ValPhe Ser Leu Ala Asp Ala Ala Arg Leu Val Ala 3285 3290 3295 Ala Arg GlyArg Leu Met Gln Glu Leu Pro Ala Gly Gly Ala Met Leu 3300 3305 3310 AlaVal Gln Ala Ala Glu Asp Glu Ile Arg Val Trp Leu Glu Thr Glu 3315 33203325 Glu Arg Tyr Ala Gly Arg Leu Asp Val Ala Ala Val Asn Gly Pro Glu3330 
3335 3340 Ala Ala Val Leu Ser Gly Asp Ala Asp Ala Ala Arg Glu AlaGlu Ala 3345 3350 3355 3360 Tyr Trp Ser Gly Leu Gly Arg Arg Thr Arg AlaLeu Arg Val Ser His 3365 3370 3375 Ala Phe His Ser Ala His Met Asp GlyMet Leu Asp Gly Phe Arg Ala 3380 3385 3390 Val Leu Glu Thr Val Glu PheArg Arg Pro Ser Leu Thr Val Val Ser 3395 3400 3405 Asn Val Thr Gly LeuAla Ala Gly Pro Asp Asp Leu Cys Asp Pro Glu 3410 3415 3420 Tyr Trp ValArg His Val Arg Gly Thr Val Arg Phe Leu Asp Gly Val 3425 3430 3435 3440Arg Val Leu Arg Asp Leu Gly Val Arg Thr Cys Leu Glu Leu Gly Pro 34453450 3455 Asp Gly Val Leu Thr Ala Met Ala Ala Asp Gly Leu Ala Asp ThrPro 3460 3465 3470 Ala Asp Ser Ala Ala Gly Ser Pro Val Gly Ser Pro AlaGly Ser Pro 3475 3480 3485 Ala Asp Ser Ala Ala Gly Ala Leu Arg Pro ArgPro Leu Leu Val Ala 3490 3495 3500 Leu Leu Arg Arg Lys Arg Ser Glu ThrGlu Thr Val Ala Asp Ala Leu 3505 3510 3515 3520 Gly Arg Ala His Ala HisGly Thr Gly Pro Asp Trp His Ala Trp Phe 3525 3530 3535 Ala Gly Ser GlyAla His Arg Val Asp Leu Pro Thr Tyr Ser Phe Arg 3540 3545 3550 Arg AspArg Tyr Trp Leu Asp Ala Pro Ala Ala Asp Thr Ala Val Asp 3555 3560 3565Thr Ala Gly Leu Gly Leu Gly Thr Ala Asp His Pro Leu Leu Gly Ala 35703575 3580 Val Val Ser Leu Pro Asp Arg Asp Gly Leu Leu Leu Thr Gly ArgLeu 3585 3590 3595 3600 Ser Leu Arg Thr His Pro Trp Leu Ala Asp His AlaVal Leu Gly Ser 3605 3610 3615 Val Leu Leu Pro Gly Ala Ala Met Val GluLeu Ala Ala His Ala Ala 3620 3625 3630 Glu Ser Ala Gly Leu Arg Asp ValArg Glu Leu Thr Leu Leu Glu Pro 3635 3640 3645 Leu Val Leu Pro Glu HisGly Gly Val Glu Leu Arg Val Thr Val Gly 3650 3655 3660 Ala Pro Ala GlyGlu Pro Gly Gly Glu Ser Ala Gly Asp Gly Ala Arg 3665 3670 3675 3680 ProVal Ser Leu His Ser Arg Leu Ala Asp Ala Pro Ala Gly Thr Ala 3685 36903695 Trp Ser Cys His Ala Thr Gly Leu Leu Ala Thr Asp Arg Pro Glu Leu3700 3705 3710 Pro Val Ala Pro Asp Arg Ala Ala Met Trp Pro Pro Gln GlyAla Glu 3715 3720 3725 Glu Val Pro Leu Asp Gly Leu Tyr Glu Arg Leu AspGly Asn Gly Leu 3730 3735 3740 Ala Phe Gly Pro 
Leu Phe Gln Gly Leu AsnAla Val Trp Arg Tyr Glu 3745 3750 3755 3760 Gly Glu Val Phe Ala Asp IleAla Leu Pro Ala Thr Thr Asn Ala Thr 3765 3770 3775 Ala Pro Ala Thr AlaAsn Gly Gly Gly Ser Ala Ala Ala Ala Pro Tyr 3780 3785 3790 Gly Ile HisPro Ala Leu Leu Asp Ala Ser Leu His Ala Ile Ala Val 3795 3800 3805 GlyGly Leu Val Asp Glu Pro Glu Leu Val Arg Val Pro Phe His Trp 3810 38153820 Ser Gly Val Thr Val His Ala Ala Gly Ala Ala Ala Ala Arg Val Arg3825 3830 3835 3840 Leu Ala Ser Ala Gly Thr Asp Ala Val Ser Leu Ser LeuThr Asp Gly 3845 3850 3855 Glu Gly Arg Pro Leu Val Ser Val Glu Arg LeuThr Leu Arg Pro Val 3860 3865 3870 Thr Ala Asp Gln Ala Ala Ala Ser ArgVal Gly Gly Leu Met His Arg 3875 3880 3885 Val Ala Trp Arg Pro Tyr AlaLeu Ala Ser Ser Gly Glu Gln Asp Pro 3890 3895 3900 His Ala Thr Ser TyrGly Pro Thr Ala Val Leu Gly Lys Asp Glu Leu 3905 3910 3915 3920 Lys ValAla Ala Ala Leu Glu Ser Ala Gly Val Glu Val Gly Leu Tyr 3925 3930 3935Pro Asp Leu Ala Ala Leu Ser Gln Asp Val Ala Ala Gly Ala Pro Ala 39403945 3950 Pro Arg Thr Val Leu Ala Pro Leu Pro Ala Gly Pro Ala Asp GlyGly 3955 3960 3965 Ala Glu Gly Val Arg Gly Thr Val Ala Arg Thr Leu GluLeu Leu Gln 3970 3975 3980 Ala Trp Leu Ala Asp Glu His Leu Ala Gly ThrArg Leu Leu Leu Val 3985 3990 3995 4000 Thr Arg Gly Ala Val Arg Asp ProGlu Gly Ser Gly Ala Asp Asp Gly 4005 4010 4015 Gly Glu Asp Leu Ser HisAla Ala Ala Trp Gly Leu Val Arg Thr Ala 4020 4025 4030 Gln Thr Glu AsnPro Gly Arg Phe Gly Leu Leu Asp Leu Ala Asp Asp 4035 4040 4045 Ala SerSer Tyr Arg Thr Leu Pro Ser Val Leu Ser Asp Ala Gly Leu 4050 4055 4060Arg Asp Glu Pro Gln Leu Ala Leu His Asp Gly Thr Ile Arg Leu Ala 40654070 4075 4080 Arg Leu Ala Ser Val Arg Pro Glu Thr Gly Thr Ala Ala ProAla Leu 4085 4090 4095 Ala Pro Glu Gly Thr Val Leu Leu Thr Gly Gly ThrGly Gly Leu Gly 4100 4105 4110 Gly Leu Val Ala Arg His Val Val Gly GluTrp Gly Val Arg Arg Leu 4115 4120 4125 Leu Leu Val Ser Arg Arg Gly ThrAsp Ala Pro Gly Ala Asp Glu Leu 4130 4135 4140 Val His Glu Leu Glu AlaLeu Gly Ala Asp Val 
Ser Val Ala Ala Cys 4145 4150 4155 4160 Asp Val AlaAsp Arg Glu Ala Leu Thr Ala Val Leu Asp Ala Ile Pro 4165 4170 4175 AlaGlu His Pro Leu Thr Ala Val Val His Thr Ala Gly Val Leu Ser 4180 41854190 Asp Gly Thr Leu Pro Ser Met Thr Thr Glu Asp Val Glu His Val Leu4195 4200 4205 Arg Pro Lys Val Asp Ala Ala Phe Leu Leu Asp Glu Leu ThrSer Thr 4210 4215 4220 Pro Ala Tyr Asp Leu Ala Ala Phe Val Met Phe SerSer Ala Ala Ala 4225 4230 4235 4240 Val Phe Gly Gly Ala Gly Gln Gly AlaTyr Ala Ala Ala Asn Ala Thr 4245 4250 4255 Leu Asp Ala Leu Ala Trp ArgArg Arg Ala Ala Gly Leu Pro Ala Leu 4260 4265 4270 Ser Leu Gly Trp GlyLeu Trp Ala Glu Thr Ser Gly Met Thr Gly Glu 4275 4280 4285 Leu Gly GlnAla Asp Leu Arg Arg Met Ser Arg Ala Gly Ile Gly Gly 4290 4295 4300 IleSer Asp Ala Glu Gly Ile Ala Leu Leu Asp Ala Ala Leu Arg Asp 4305 43104315 4320 Asp Arg His Pro Val Leu Leu Pro Leu Arg Leu Asp Ala Ala GlyLeu 4325 4330 4335 Arg Asp Ala Ala Gly Asn Asp Pro Ala Gly Ile Pro AlaLeu Phe Arg 4340 4345 4350 Asp Val Val Gly Ala Arg Thr Val Arg Ala ArgPro Ser Ala Ala Ser 4355 4360 4365 Ala Ser Thr Thr Ala Gly Thr Ala GlyThr Pro Gly Thr Ala Asp Gly 4370 4375 4380 Ala Ala Glu Thr Ala Ala ValThr Leu Ala Asp Arg Ala Ala Thr Val 4385 4390 4395 4400 Asp Gly Pro AlaArg Gln Arg Leu Leu Leu Glu Phe Val Val Gly Glu 4405 4410 4415 Val AlaGlu Val Leu Gly His Ala Arg Gly His Arg Ile Asp Ala Glu 4420 4425 4430Arg Gly Phe Leu Asp Leu Gly Phe Asp Ser Leu Thr Ala Val Glu Leu 44354440 4445 Arg Asn Arg Leu Asn Ser Ala Gly Gly Leu Ala Leu Pro Ala ThrLeu 4450 4455 4460 Val Phe Asp His Pro Ser Pro Ala Ala Leu Ala Ser HisLeu Asp Ala 4465 4470 4475 4480 Glu Leu Pro Arg Gly Ala Ser Asp Gln AspGly Ala Gly Asn Arg Asn 4485 4490 4495 Gly Asn Glu Asn Gly Thr Thr AlaSer Arg Ser Thr Ala Glu Thr Asp 4500 4505 4510 Ala Leu Leu Ala Gln LeuThr Arg Leu Glu Gly Ala Leu Val Leu Thr 4515 4520 4525 Gly Leu Ser AspAla Pro Gly Ser Glu Glu Val Leu Glu His Leu Arg 4530 4535 4540 Ser LeuArg Ser Met Val Thr Gly Glu Thr Gly Thr Gly Thr Ala Ser 4545 
4550 4555 4560 Gly Ala Pro Asp Gly Ala Gly Ser Gly Ala Glu Asp Arg Pro Trp Ala 4565 4570 4575 Ala Gly Asp Gly Ala Gly Gly Gly Ser Glu Asp Gly Ala Gly Val Pro 4580 4585 4590 Asp Phe Met Asn Ala Ser Ala Glu Glu Leu Phe Gly Leu Leu Asp Gln 4595 4600 4605 Asp Pro Ser Thr Asp 4610
32 11220 DNA Streptomyces venezuelae 32
gtgtccacgg tgaacgaaga gaagtacctc gactacctgc gtcgtgccac ggcggacctc 60 cacgaggccc gtggccgcct ccgcgagctg gaggcgaagg cgggcgagcc ggtggcgatc 120 gtcggcatgg cctgccgcct gcccggcggc gtcgcctcgc ccgaggacct gtggcggctg 180 gtggccggcg gcgaggacgc gatctcggag ttcccccagg accgcggctg ggacgtggag 240 ggcctgtacg acccgaaccc ggaggccacg ggcaagagtt acgcccgcga ggccggattc 300 ctgtacgagg cgggcgagtt cgacgccgac ttcttcggga tctcgccgcg cgaggccctc 360 gccatggacc cgcagcagcg tctcctcctg gaggcctcct gggaggcgtt cgagcacgcc 420 gggatcccgg cggccaccgc gcgcggcacc tcggtcggcg tcttcaccgg cgtgatgtac 480 cacgactacg ccacccgtct caccgatgtc ccggagggca tcgagggcta cctgggcacc 540 ggcaactccg gcagtgtcgc ctcgggccgc gtcgcgtaca cgcttggcct ggaggggccg 600 gccgtcacgg tcgacaccgc ctgctcgtcc tcgctggtcg ccctgcacct cgccgtgcag 660 gccctgcgca agggcgaggt cgacatggcg ctcgccggcg gcgtgacggt catgtcgacg 720 cccagcacct tcgtcgagtt cagccgtcag cgcgggctgg cgccggacgg ccggtcgaag 780 tccttctcgt cgacggccga cggcaccagc tggtccgagg gcgtcggcgt cctcctcgtc 840 gagcgcctgt ccgacgcgcg tcgcaagggc catcggatcc tcgccgtggt ccggggcacc 900 gccgtcaacc aggacggcgc cagcagcggc ctcacggctc cgaacgggcc gtcgcagcag 960 cgcgtcatcc gacgtgccct ggcggacgcc cggctcacga cctccgacgt ggacgtcgtc 1020 gaggcccacg gcacgggtac gcgactcggc gacccgatcg aggcgcaggc cgtcatcgcc 1080 acgtacgggc agggccgtga cggcgaacag ccgctgcgcc tcgggtcgtt gaagtccaac 1140 atcggacaca cccaggccgc cgccggtgtc tccggcgtga tcaagatggt ccaggcgatg 1200 cgccacggcg tcctgccgaa gacgctccac gtggagaagc cgacggacca ggtggactgg 1260 tccgcgggcg cggtcgagct gctcaccgag gccatggact ggccggacaa gggcgacggc 1320 ggactgcgca gggccgcggt ctcctccttc ggcgtcagcg ggacgaacgc gcacgtcgtg 1380 ctcgaagagg ccccggcggc cgaggagacc cctgcctccg aggcgacccc ggccgtcgag 1440 ccgtcggtcg gcgccggcct
ggtgccgtgg ctggtgtcggcgaagactcc ggccgcgctg 1500 gacgcccaga tcggacgcct cgccgcgttc gcctcgcagggccgtacgga cgccgccgat 1560 ccgggcgcgg tcgctcgcgt actggccggc gggcgcgccgagttcgagca ccgggccgtc 1620 gtgctcggca ccggacagga cgatttcgcg caggcgctgaccgctccgga aggactgata 1680 cgcggcacgc cctcggacgt gggccgggtg gcgttcgtgttccccggtca gggcacgcag 1740 tgggccggga tgggcgccga actcctcgac gtgtcgaaggagttcgcggc ggccatggcc 1800 gagtgcgaga gcgcgctctc ccgctatgtc gactggtcgctggaggccgt cgtccggcag 1860 gcgccgggcg cgcccacgct ggagcgggtc gacgtcgtccagcccgtgac cttcgctgtc 1920 atggtttcgc tggcgaaggt ctggcagcac cacggcgtgacgccgcaggc cgtcgtcggc 1980 cactcgcagg gcgagatcgc cgccgcgtac gtcgccggtgccctcaccct cgacgacgcc 2040 gcccgcgtcg tcaccctgcg cagcaagtcc atcgccgcccacctcgccgg caagggcggc 2100 atgatctccc tcgccctcag cgaggaagcc acccggcagcgcatcgagaa cctccacgga 2160 ctgtcgatcg ccgccgtcaa cggccccacc gccaccgtggtttcgggcga ccccacccag 2220 atccaagagc tcgctcaggc gtgtgaggcc gacggggtccgcgcacggat catccccgtc 2280 gactacgcct cccacagcgc ccacgtcgag accatcgagagcgaactcgc cgaggtcctc 2340 gccgggctca gcccgcggac acctgaggtg ccgttcttctcgacactcga aggcgcctgg 2400 atcaccgagc cggtgctcga cggcacctac tggtaccgcaacctccgcca ccgcgtcggc 2460 ttcgcccccg ccgtcgagac cctcgccacc gacgaaggcttcacccactt catcgaggtc 2520 agcgcccacc ccgtcctcac catgaccctc cccgagaccgtcaccggcct cggcaccctc 2580 cgccgcgaac agggaggcca ggagcgtctg gtcacctcactcgccgaagc ctggaccaac 2640 ggcctcacca tcgactgggc gcccgtcctc cccaccgcaaccggccacca ccccgagctc 2700 cccacctacg ccttccagcg ccgtcactac tggctccacgactcccccgc cgtccagggc 2760 tccgtgcagg actcctggcg ctaccgcatc gactggaagcgcctcgcggt cgccgacgcg 2820 tccgagcgcg ccgggctgtc cgggcgctgg ctcgtcgtcgtccccgagga ccgttccgcc 2880 gaggccgccc cggtgctcgc cgcgctgtcc ggcgccggcgccgaccccgt acagctggac 2940 gtgtccccgc tgggcgaccg gcagcggctc gccgcgacgctgggcgaggc cctggcggcg 3000 gccggtggag ccgtcgacgg cgtcctctcg ctgctcgcgtgggacgagag cgcgcacccc 3060 ggccaccccg cccccttcac ccggggcacc ggcgccaccctcaccctggt gcaggcgctg 3120 gaggacgccg gcgtcgccgc cccgctgtgg tgcgtgacccacggcgcggt 
gtccgtcggc 3180 cgggccgacc acgtcacctc ccccgcccag gccatggtgtggggcatggg ccgggtcgcc 3240 gccctggagc accccgagcg gtggggcggc ctgatcgacctgccctcgga cgccgaccgg 3300 gcggccctgg accgcatgac cacggtcctc gccggcggtacgggtgagga ccaggtcgcg 3360 gtacgcgcct ccgggctgct cgcccgccgc ctcgtccgcgcctccctccc ggcgcacggc 3420 acggcttcgc cgtggtggca ggccgacggc acggtgctcgtcaccggtgc cgaggagcct 3480 gcggccgccg aggccgcacg ccggctggcc cgcgacggcgccggacacct cctcctccac 3540 accaccccct ccggcagcga aggcgccgaa ggcacctccggtgccgccga ggactccggc 3600 ctcgccgggc tcgtcgccga actcgcggac ctgggcgcgacggccaccgt cgtgacctgc 3660 gacctcacgg acgcggaggc ggccgcccgg ctgctcgccggcgtctccga cgcgcacccg 3720 ctcagcgccg tcctccacct gccgcccacc gtcgactccgagccgctcgc cgcgaccgac 3780 gcggacgcgc tcgcccgtgt cgtgaccgcg aaggccaccgccgcgctcca cctggaccgc 3840 ctcctgcggg aggccgcggc tgccggaggc cgtccgcccgtcctggtcct cttctcctcg 3900 gtcgccgcga tctggggcgg cgccggtcag ggcgcgtacgccgccggtac ggccttcctc 3960 gacgccctcg ccggtcagca ccgggccgac ggccccaccgtgacctcggt ggcctggagc 4020 ccctgggagg gcagccgcgt caccgagggt gcgaccggggagcggctgcg ccgcctcggc 4080 ctgcgccccc tcgcccccgc gacggcgctc accgccctggacaccgcgct cggccacggc 4140 gacaccgccg tcacgatcgc cgacgtcgac tggtcgagcttcgcccccgg cttcaccacg 4200 gcccggccgg gcaccctcct cgccgatctg cccgaggcgcgccgcgcgct cgacgagcag 4260 cagtcgacga cggccgccga cgacaccgtc ctgagccgcgagctcggtgc gctcaccggc 4320 gccgaacagc agcgccgtat gcaggagttg gtccgcgagcacctcgccgt ggtcctcaac 4380 cacccctccc ccgaggccgt cgacacgggg cgggccttccgtgacctcgg attcgactcg 4440 ctgacggcgg tcgagctccg caaccgcctc aagaacgccaccggcctggc cctcccggcc 4500 actctggtct tcgactaccc gaccccccgg acgctggcggagttcctcct cgcggagatc 4560 ctgggcgagc aggccggtgc cggcgagcag cttccggtggacggcggggt cgacgacgag 4620 cccgtcgcga tcgtcggcat ggcgtgccgc ctgccgggcggtgtcgcctc gccggaggac 4680 ctgtggcggc tggtggccgg cggcgaggac gcgatctccggcttcccgca ggaccgcggc 4740 tgggacgtgg aggggctgta cgacccggac ccggacgcgtccgggcggac gtactgccgt 4800 gccggtggct tcctcgacga ggcgggcgag ttcgacgccgacttcttcgg gatctcgccg 4860 cgcgaggccc tcgccatgga 
cccgcagcag cggctcctcctggagacctc ctgggaggcc 4920 gtcgaggacg ccgggatcga cccgacctcc cttcaggggcagcaggtcgg cgtgttcgcg 4980 ggcaccaacg gcccccacta cgagccgctg ctccgcaacaccgccgagga tcttgagggt 5040 tacgtcggga cgggcaacgc cgccagcatc atgtcgggccgtgtctcgta caccctcggc 5100 ctggagggcc cggccgtcac ggtcgacacc gcctgctcctcctcgctggt cgccctgcac 5160 ctcgccgtgc aggccctgcg caagggcgaa tgcggactggcgctcgcggg cggtgtgacg 5220 gtcatgtcga cgcccacgac gttcgtggag ttcagccggcagcgcgggct cgcggaggac 5280 ggccggtcga aggcgttcgc cgcgtcggcg gacggcttcggcccggcgga gggcgtcggc 5340 atgctcctcg tcgagcgcct gtcggacgcc cgccgcaacggacaccgtgt gctggcggtc 5400 gtgcgcggca gcgcggtcaa ccaggacggc gcgagcaacggcctgaccgc cccgaacggg 5460 ccctcgcagc agcgcgtcat ccggcgcgcg ctcgcggacgcccgactgac gaccgccgac 5520 gtggacgtcg tcgaggccca cggcacgggc acgcgactcggcgacccgat cgaggcacag 5580 gccctcatcg ccacctacgg ccaggggcgc gacaccgaacagccgctgcg cctggggtcg 5640 ttgaagtcca acatcggaca cacccaggcc gccgccggtgtctccggcat catcaagatg 5700 gtccaggcga tgcgccacgg cgtcctgccg aagacgctccacgtggaccg gccgtcggac 5760 cagatcgact ggtcggcggg cacggtcgag ctgctcaccgaggccatgga ctggccgagg 5820 aagcaggagg gcgggctgcg ccgcgcggcc gtctcctccttcggcatcag cggcacgaac 5880 gcgcacatcg tgctcgaaga agccccggtc gacgaggacgccccggcgga cgagccgtcg 5940 gtcggcggtg tggtgccgtg gctcgtgtcc gcgaagactccggccgcgct ggacgcccag 6000 atcggacgcc tcgccgcgtt cgcctcgcag ggccgtacggacgccgccga tccgggcgcg 6060 gtcgctcgcg tactggccgg cgggcgtgcg cagttcgagcaccgggccgt cgcgctcggc 6120 accggacagg acgacctggc ggccgcactg gccgcgcctgagggtctggt ccggggtgtg 6180 gcctccggtg tgggtcgagt ggcgttcgtg ttcccgggacagggcacgca gtgggccggg 6240 atgggtgccg aactcctcga cgtgtcgaag gagttcgcggcggccatggc cgagtgcgag 6300 gccgcgctcg ctccgtacgt ggactggtcg ctggaggccgtcgtccgaca ggcccccggc 6360 gcgcccacgc tggagcgggt cgatgtcgtc cagcccgtgacgttcgccgt catggtctcg 6420 ctggcgaagg tctggcagca ccacggggtg accccgcaagccgtcgtcgg ccactcgcag 6480 ggcgagatcg ccgccgcgta cgtcgccggt gccctgagcctggacgacgc cgctcgtgtc 6540 gtgaccctgc gcagcaagtc catcggcgcc cacctcgcgggccagggcgg 
catgctgtcc 6600 ctcgcgctga gcgaggcggc cgttgtggag cgactggccgggttcgacgg gctgtccgtc 6660 gccgccgtca acgggcctac cgccaccgtg gtttcgggcgacccgaccca gatccaagag 6720 ctcgctcagg cgtgtgaggc cgacggggtc cgcgcacggatcatccccgt cgactacgcc 6780 tcccacagcg cccacgtcga gaccatcgag agcgaactcgccgacgtcct ggcggggttg 6840 tccccccaga caccccaggt ccccttcttc tccaccctcgaaggcgcctg gatcaccgaa 6900 cccgccctcg acggcggcta ctggtaccgc aacctccgccatcgtgtggg cttcgccccg 6960 gccgtcgaaa ccctggccac cgacgaaggc ttcacccacttcgtcgaggt cagcgcccac 7020 cccgtcctca ccatggcgct gcccgagacc gtcaccggactcggcaccct ccgccgtgac 7080 aacggcggac agcaccgcct caccacctcc ctcgccgaggcctgggccaa cggcctcacc 7140 gtcgactggg cctctctcct ccccaccacg accacccaccccgatctgcc cacctacgcc 7200 ttccagaccg agcgctactg gccgcagccc gacctctccgccgccggtga catcacctcc 7260 gccggtctcg gggcggccga gcacccgctg ctcggcgcggccgtggcgct cgcggactcc 7320 gacggctgcc tgctcacggg gagcctctcc ctccgtacgcacccctggct ggcggaccac 7380 gcggtggccg gcaccgtgct gctgccggga acggcgttcgtggagctggc gttccgagcc 7440 ggggaccagg tcggttgcga tctggtcgag gagctcaccctcgacgcgcc gctcgtgctg 7500 ccccgtcgtg gcgcggtccg tgtgcagctg tccgtcggcgcgagcgacga gtccgggcgt 7560 cgtaccttcg ggctctacgc gcacccggag gacgcgccgggcgaggcgga gtggacgcgg 7620 cacgccaccg gtgtgctggc cgcccgtgcg gaccgcaccgcccccgtcgc cgacccggag 7680 gcctggccgc cgccgggcgc cgagccggtg gacgtggacggtctgtacga gcgcttcgcg 7740 gcgaacggct acggctacgg ccccctcttc cagggcgtccgtggtgtctg gcggcgtggc 7800 gacgaggtgt tcgccgacgt ggccctgccg gccgaggtcgccggtgccga gggcgcgcgg 7860 ttcggccttc acccggcgct gctcgacgcc gccgtgcaggcggccggtgc gggccggggc 7920 gttcggcgcg ggcacgcggc tgccgttcgc ctggagcgggatctcctgta cgcggtcggc 7980 gccaccgccc tccgcgtgcg gctggccccc gccggcccggacacggtgtc cgtgagcgcc 8040 gccgactcct ccgggcagcc ggtgttcgcc gcggactccctcacggtgct gcccgtcgac 8100 cccgcgcagc tggcggcctt cagcgacccg actctggacgcgctgcacct gctggagtgg 8160 accgcctggg acggtgccgc gcaggccctg cccggcgcggtcgtgctggg cggcgacgcc 8220 gacggtctcg ccgcggcgct gcgcgccggt ggcaccgaggtcctgtcctt cccggacctt 8280 acggacctgg tggaggccgt 
cgaccggggc gagaccccggccccggcgac cgtcctggtg 8340 gcctgccccg ccgccggccc cgatgggccg gagcatgtccgcgaggccct gcacgggtcg 8400 ctcgcgctga tgcaggcctg gctggccgac gagcggttcaccgatgggcg cctggtgctc 8460 gtgacccgcg acgcggtcgc cgcccgttcc ggcgacggcctgcggtccac gggacaggcc 8520 gccgtctggg gcctcggccg gtccgcgcag acggagagcccgggccggtt cgtcctgctc 8580 gacctcgccg gggaagcccg gacggccggg gacgccaccgccggggacgg cctgacgacc 8640 ggggacgcca ccgtcggcgg cacctctgga gacgccgccctcggcagcgc cctcgcgacc 8700 gccctcggct cgggcgagcc gcagctcgcc ctccgggacggggcgctcct cgtaccccgc 8760 ctggcgcggg ccgccgcgcc cgccgcggcc gacggcctcgccgcggccga cggcctcgcc 8820 gctctgccgc tgcccgccgc tccggccctc tggcgtctggagcccggtac ggacggcagc 8880 ctggagagcc tcacggcggc gcccggcgac gccgagaccctcgccccgga gccgctcggc 8940 ccgggacagg tccgcatcgc gatccgggcc accggtctcaacttccgcga cgtcctgatc 9000 gccctcggca tgtaccccga tccggcgctg atgggcaccgagggagccgg cgtggtcacc 9060 gcgaccggcc ccggcgtcac gcacctcgcc cccggcgaccgggtcatggg cctgctctcc 9120 ggcgcgtacg ccccggtcgt cgtggcggac gcgcggaccgtcgcgcggat gcccgagggg 9180 tggacgttcg cccagggcgc ctccgtgccg gtggtgttcctgacggccgt ctacgccctg 9240 cgcgacctgg cggacgtcaa gcccggcgag cgcctcctggtccactccgc cgccggtggc 9300 gtgggcatgg ccgccgtgca gctcgcccgg cactggggcgtggaggtcca cggcacggcg 9360 agtcacggga agtgggacgc cctgcgcgcg ctcggcctggacgacgcgca catcgcctcc 9420 tcccgcaccc tggacttcga gtccgcgttc cgtgccgcttccggcggggc gggcatggac 9480 gtcgtactga actcgctcgc ccgcgagttc gtcgacgcctcgctgcgcct gctcgggccg 9540 ggcggccggt tcgtggagat ggggaagacc gacgtccgcgacgcggagcg ggtcgccgcc 9600 gaccaccccg gtgtcggcta ccgcgccttc gacctgggcgaggccgggcc ggagcggatc 9660 ggcgagatgc tcgccgaggt catcgccctc ttcgaggacggggtgctccg gcacctgccc 9720 gtcacgacct gggacgtgcg ccgggcccgc gacgccttccggcacgtcag ccaggcccgc 9780 cacacgggca aggtcgtcct cacgatgccg tcgggcctcgacccggaggg tacggtcctg 9840 ctgaccggcg gcaccggtgc gctggggggc atcgtggcccggcacgtggt gggcgagtgg 9900 ggcgtacgac gcctgctgct cgtgagccgg cggggcacggacgccccggg cgccggcgag 9960 ctcgtgcacg agctggaggc cctgggagcc gacgtctcggtggccgcgtg 
cgacgtcgcc 10020 gaccgcgaag ccctcaccgc cgtactcgac tcgatccccg ccgaacaccc gctcaccgcg 10080 gtcgtccaca cggcaggcgt cctctccgac ggcaccctcc cctcgatgac agcggaggat 10140 gtggaacacg tactgcgtcc caaggtcgac gccgcgttcc tcctcgacga actcacctcg 10200 acgcccggct acgacctggc agcgttcgtc atgttctcct ccgccgccgc cgtcttcggt 10260 ggcgcggggc agggcgccta cgccgccgcc aacgccaccc tcgacgccct cgcctggcgc 10320 cgccggacag ccggactccc cgccctctcc ctcggctggg gcctctgggc cgagaccagc 10380 ggcatgaccg gcggactcag cgacaccgac cgctcgcggc tggcccgttc cggggcgacg 10440 cccatggaca gcgagctgac cctgtccctc ctggacgcgg ccatgcgccg cgacgacccg 10500 gcgctcgtcc cgatcgccct ggacgtcgcc gcgctccgcg cccagcagcg cgacggcatg 10560 ctggcgccgc tgctcagcgg gctcacccgc ggatcgcggg tcggcggcgc gccggtcaac 10620 cagcgcaggg cagccgccgg aggcgcgggc gaggcggaca cggacctcgg cgggcggctc 10680 gccgcgatga caccggacga ccgggtcgcg cacctgcggg acctcgtccg tacgcacgtg 10740 gcgaccgtcc tgggacacgg caccccgagc cgggtggacc tggagcgggc cttccgcgac 10800 accggtttcg actcgctcac cgccgtcgaa ctccgcaacc gtctcaacgc cgcgaccggg 10860 ctgcggctgc cggccacgct ggtcttcgac caccccaccc cgggggagct cgccgggcac 10920 ctgctcgacg aactcgccac ggccgcgggc gggtcctggg cggaaggcac cgggtccgga 10980 gacacggcct cggcgaccga tcggcagacc acggcggccc tcgccgaact cgaccggctg 11040 gaaggcgtgc tcgcctccct cgcgcccgcc gccggcggcc gtccggagct cgccgcccgg 11100 ctcagggcgc tggccgcggc cctgggggac gacggcgacg acgccaccga cctggacgag 11160 gcgtccgacg acgacctctt ctccttcatc gacaaggagc tgggcgactc cgacttctga 11220
33 3739 PRT Streptomyces venezuelae 33
Met Ser Thr Val Asn Glu Glu Lys Tyr Leu Asp Tyr Leu Arg Arg Ala 1 5 10 15 Thr Ala Asp Leu His Glu Ala Arg Gly Arg Leu Arg Glu Leu Glu Ala 20 25 30 Lys Ala Gly Glu Pro Val Ala Ile Val Gly Met Ala Cys Arg Leu Pro 35 40 45 Gly Gly Val Ala Ser Pro Glu Asp Leu Trp Arg Leu Val Ala Gly Gly 50 55 60 Glu Asp Ala Ile Ser Glu Phe Pro Gln Asp Arg Gly Trp Asp Val Glu 65 70 75 80 Gly Leu Tyr Asp Pro Asn Pro Glu Ala Thr Gly Lys Ser Tyr Ala Arg 85 90 95 Glu Ala Gly Phe Leu Tyr Glu Ala Gly Glu Phe Asp Ala Asp Phe Phe 100 105 110 Gly Ile Ser
Pro Arg Glu Ala Leu Ala Met Asp Pro Gln Gln Arg Leu115 120 125 Leu Leu Glu Ala Ser Trp Glu Ala Phe Glu His Ala Gly Ile ProAla 130 135 140 Ala Thr Ala Arg Gly Thr Ser Val Gly Val Phe Thr Gly ValMet Tyr 145 150 155 160 His Asp Tyr Ala Thr Arg Leu Thr Asp Val Pro GluGly Ile Glu Gly 165 170 175 Tyr Leu Gly Thr Gly Asn Ser Gly Ser Val AlaSer Gly Arg Val Ala 180 185 190 Tyr Thr Leu Gly Leu Glu Gly Pro Ala ValThr Val Asp Thr Ala Cys 195 200 205 Ser Ser Ser Leu Val Ala Leu His LeuAla Val Gln Ala Leu Arg Lys 210 215 220 Gly Glu Val Asp Met Ala Leu AlaGly Gly Val Thr Val Met Ser Thr 225 230 235 240 Pro Ser Thr Phe Val GluPhe Ser Arg Gln Arg Gly Leu Ala Pro Asp 245 250 255 Gly Arg Ser Lys SerPhe Ser Ser Thr Ala Asp Gly Thr Ser Trp Ser 260 265 270 Glu Gly Val GlyVal Leu Leu Val Glu Arg Leu Ser Asp Ala Arg Arg 275 280 285 Lys Gly HisArg Ile Leu Ala Val Val Arg Gly Thr Ala Val Asn Gln 290 295 300 Asp GlyAla Ser Ser Gly Leu Thr Ala Pro Asn Gly Pro Ser Gln Gln 305 310 315 320Arg Val Ile Arg Arg Ala Leu Ala Asp Ala Arg Leu Thr Thr Ser Asp 325 330335 Val Asp Val Val Glu Ala His Gly Thr Gly Thr Arg Leu Gly Asp Pro 340345 350 Ile Glu Ala Gln Ala Val Ile Ala Thr Tyr Gly Gln Gly Arg Asp Gly355 360 365 Glu Gln Pro Leu Arg Leu Gly Ser Leu Lys Ser Asn Ile Gly HisThr 370 375 380 Gln Ala Ala Ala Gly Val Ser Gly Val Ile Lys Met Val GlnAla Met 385 390 395 400 Arg His Gly Val Leu Pro Lys Thr Leu His Val GluLys Pro Thr Asp 405 410 415 Gln Val Asp Trp Ser Ala Gly Ala Val Glu LeuLeu Thr Glu Ala Met 420 425 430 Asp Trp Pro Asp Lys Gly Asp Gly Gly LeuArg Arg Ala Ala Val Ser 435 440 445 Ser Phe Gly Val Ser Gly Thr Asn AlaHis Val Val Leu Glu Glu Ala 450 455 460 Pro Ala Ala Glu Glu Thr Pro AlaSer Glu Ala Thr Pro Ala Val Glu 465 470 475 480 Pro Ser Val Gly Ala GlyLeu Val Pro Trp Leu Val Ser Ala Lys Thr 485 490 495 Pro Ala Ala Leu AspAla Gln Ile Gly Arg Leu Ala Ala Phe Ala Ser 500 505 510 Gln Gly Arg ThrAsp Ala Ala Asp Pro Gly Ala Val Ala Arg Val Leu 515 520 525 Ala Gly GlyArg Ala Glu Phe Glu His Arg Ala 
Val Val Leu Gly Thr 530 535 540 Gly GlnAsp Asp Phe Ala Gln Ala Leu Thr Ala Pro Glu Gly Leu Ile 545 550 555 560Arg Gly Thr Pro Ser Asp Val Gly Arg Val Ala Phe Val Phe Pro Gly 565 570575 Gln Gly Thr Gln Trp Ala Gly Met Gly Ala Glu Leu Leu Asp Val Ser 580585 590 Lys Glu Phe Ala Ala Ala Met Ala Glu Cys Glu Ser Ala Leu Ser Arg595 600 605 Tyr Val Asp Trp Ser Leu Glu Ala Val Val Arg Gln Ala Pro GlyAla 610 615 620 Pro Thr Leu Glu Arg Val Asp Val Val Gln Pro Val Thr PheAla Val 625 630 635 640 Met Val Ser Leu Ala Lys Val Trp Gln His His GlyVal Thr Pro Gln 645 650 655 Ala Val Val Gly His Ser Gln Gly Glu Ile AlaAla Ala Tyr Val Ala 660 665 670 Gly Ala Leu Thr Leu Asp Asp Ala Ala ArgVal Val Thr Leu Arg Ser 675 680 685 Lys Ser Ile Ala Ala His Leu Ala GlyLys Gly Gly Met Ile Ser Leu 690 695 700 Ala Leu Ser Glu Glu Ala Thr ArgGln Arg Ile Glu Asn Leu His Gly 705 710 715 720 Leu Ser Ile Ala Ala ValAsn Gly Pro Thr Ala Thr Val Val Ser Gly 725 730 735 Asp Pro Thr Gln IleGln Glu Leu Ala Gln Ala Cys Glu Ala Asp Gly 740 745 750 Val Arg Ala ArgIle Ile Pro Val Asp Tyr Ala Ser His Ser Ala His 755 760 765 Val Glu ThrIle Glu Ser Glu Leu Ala Glu Val Leu Ala Gly Leu Ser 770 775 780 Pro ArgThr Pro Glu Val Pro Phe Phe Ser Thr Leu Glu Gly Ala Trp 785 790 795 800Ile Thr Glu Pro Val Leu Asp Gly Thr Tyr Trp Tyr Arg Asn Leu Arg 805 810815 His Arg Val Gly Phe Ala Pro Ala Val Glu Thr Leu Ala Thr Asp Glu 820825 830 Gly Phe Thr His Phe Ile Glu Val Ser Ala His Pro Val Leu Thr Met835 840 845 Thr Leu Pro Glu Thr Val Thr Gly Leu Gly Thr Leu Arg Arg GluGln 850 855 860 Gly Gly Gln Glu Arg Leu Val Thr Ser Leu Ala Glu Ala TrpThr Asn 865 870 875 880 Gly Leu Thr Ile Asp Trp Ala Pro Val Leu Pro ThrAla Thr Gly His 885 890 895 His Pro Glu Leu Pro Thr Tyr Ala Phe Gln ArgArg His Tyr Trp Leu 900 905 910 His Asp Ser Pro Ala Val Gln Gly Ser ValGln Asp Ser Trp Arg Tyr 915 920 925 Arg Ile Asp Trp Lys Arg Leu Ala ValAla Asp Ala Ser Glu Arg Ala 930 935 940 Gly Leu Ser Gly Arg Trp Leu ValVal Val Pro Glu Asp Arg Ser Ala 945 950 955 
960 Glu Ala Ala Pro Val LeuAla Ala Leu Ser Gly Ala Gly Ala Asp Pro 965 970 975 Val Gln Leu Asp ValSer Pro Leu Gly Asp Arg Gln Arg Leu Ala Ala 980 985 990 Thr Leu Gly GluAla Leu Ala Ala Ala Gly Gly Ala Val Asp Gly Val 995 1000 1005 Leu SerLeu Leu Ala Trp Asp Glu Ser Ala His Pro Gly His Pro Ala 1010 1015 1020Pro Phe Thr Arg Gly Thr Gly Ala Thr Leu Thr Leu Val Gln Ala Leu 10251030 1035 1040 Glu Asp Ala Gly Val Ala Ala Pro Leu Trp Cys Val Thr HisGly Ala 1045 1050 1055 Val Ser Val Gly Arg Ala Asp His Val Thr Ser ProAla Gln Ala Met 1060 1065 1070 Val Trp Gly Met Gly Arg Val Ala Ala LeuGlu His Pro Glu Arg Trp 1075 1080 1085 Gly Gly Leu Ile Asp Leu Pro SerAsp Ala Asp Arg Ala Ala Leu Asp 1090 1095 1100 Arg Met Thr Thr Val LeuAla Gly Gly Thr Gly Glu Asp Gln Val Ala 1105 1110 1115 1120 Val Arg AlaSer Gly Leu Leu Ala Arg Arg Leu Val Arg Ala Ser Leu 1125 1130 1135 ProAla His Gly Thr Ala Ser Pro Trp Trp Gln Ala Asp Gly Thr Val 1140 11451150 Leu Val Thr Gly Ala Glu Glu Pro Ala Ala Ala Glu Ala Ala Arg Arg1155 1160 1165 Leu Ala Arg Asp Gly Ala Gly His Leu Leu Leu His Thr ThrPro Ser 1170 1175 1180 Gly Ser Glu Gly Ala Glu Gly Thr Ser Gly Ala AlaGlu Asp Ser Gly 1185 1190 1195 1200 Leu Ala Gly Leu Val Ala Glu Leu AlaAsp Leu Gly Ala Thr Ala Thr 1205 1210 1215 Val Val Thr Cys Asp Leu ThrAsp Ala Glu Ala Ala Ala Arg Leu Leu 1220 1225 1230 Ala Gly Val Ser AspAla His Pro Leu Ser Ala Val Leu His Leu Pro 1235 1240 1245 Pro Thr ValAsp Ser Glu Pro Leu Ala Ala Thr Asp Ala Asp Ala Leu 1250 1255 1260 AlaArg Val Val Thr Ala Lys Ala Thr Ala Ala Leu His Leu Asp Arg 1265 12701275 1280 Leu Leu Arg Glu Ala Ala Ala Ala Gly Gly Arg Pro Pro Val LeuVal 1285 1290 1295 Leu Phe Ser Ser Val Ala Ala Ile Trp Gly Gly Ala GlyGln Gly Ala 1300 1305 1310 Tyr Ala Ala Gly Thr Ala Phe Leu Asp Ala LeuAla Gly Gln His Arg 1315 1320 1325 Ala Asp Gly Pro Thr Val Thr Ser ValAla Trp Ser Pro Trp Glu Gly 1330 1335 1340 Ser Arg Val Thr Glu Gly AlaThr Gly Glu Arg Leu Arg Arg Leu Gly 1345 1350 1355 1360 Leu Arg Pro LeuAla Pro Ala Thr 
Ala Leu Thr Ala Leu Asp Thr Ala 1365 1370 1375 Leu GlyHis Gly Asp Thr Ala Val Thr Ile Ala Asp Val Asp Trp Ser 1380 1385 1390Ser Phe Ala Pro Gly Phe Thr Thr Ala Arg Pro Gly Thr Leu Leu Ala 13951400 1405 Asp Leu Pro Glu Ala Arg Arg Ala Leu Asp Glu Gln Gln Ser ThrThr 1410 1415 1420 Ala Ala Asp Asp Thr Val Leu Ser Arg Glu Leu Gly AlaLeu Thr Gly 1425 1430 1435 1440 Ala Glu Gln Gln Arg Arg Met Gln Glu LeuVal Arg Glu His Leu Ala 1445 1450 1455 Val Val Leu Asn His Pro Ser ProGlu Ala Val Asp Thr Gly Arg Ala 1460 1465 1470 Phe Arg Asp Leu Gly PheAsp Ser Leu Thr Ala Val Glu Leu Arg Asn 1475 1480 1485 Arg Leu Lys AsnAla Thr Gly Leu Ala Leu Pro Ala Thr Leu Val Phe 1490 1495 1500 Asp TyrPro Thr Pro Arg Thr Leu Ala Glu Phe Leu Leu Ala Glu Ile 1505 1510 15151520 Leu Gly Glu Gln Ala Gly Ala Gly Glu Gln Leu Pro Val Asp Gly Gly1525 1530 1535 Val Asp Asp Glu Pro Val Ala Ile Val Gly Met Ala Cys ArgLeu Pro 1540 1545 1550 Gly Gly Val Ala Ser Pro Glu Asp Leu Trp Arg LeuVal Ala Gly Gly 1555 1560 1565 Glu Asp Ala Ile Ser Gly Phe Pro Gln AspArg Gly Trp Asp Val Glu 1570 1575 1580 Gly Leu Tyr Asp Pro Asp Pro AspAla Ser Gly Arg Thr Tyr Cys Arg 1585 1590 1595 1600 Ala Gly Gly Phe LeuAsp Glu Ala Gly Glu Phe Asp Ala Asp Phe Phe 1605 1610 1615 Gly Ile SerPro Arg Glu Ala Leu Ala Met Asp Pro Gln Gln Arg Leu 1620 1625 1630 LeuLeu Glu Thr Ser Trp Glu Ala Val Glu Asp Ala Gly Ile Asp Pro 1635 16401645 Thr Ser Leu Gln Gly Gln Gln Val Gly Val Phe Ala Gly Thr Asn Gly1650 1655 1660 Pro His Tyr Glu Pro Leu Leu Arg Asn Thr Ala Glu Asp LeuGlu Gly 1665 1670 1675 1680 Tyr Val Gly Thr Gly Asn Ala Ala Ser Ile MetSer Gly Arg Val Ser 1685 1690 1695 Tyr Thr Leu Gly Leu Glu Gly Pro AlaVal Thr Val Asp Thr Ala Cys 1700 1705 1710 Ser Ser Ser Leu Val Ala LeuHis Leu Ala Val Gln Ala Leu Arg Lys 1715 1720 1725 Gly Glu Cys Gly LeuAla Leu Ala Gly Gly Val Thr Val Met Ser Thr 1730 1735 1740 Pro Thr ThrPhe Val Glu Phe Ser Arg Gln Arg Gly Leu Ala Glu Asp 1745 1750 1755 1760Gly Arg Ser Lys Ala Phe Ala Ala Ser Ala Asp Gly Phe Gly Pro 
Ala 17651770 1775 Glu Gly Val Gly Met Leu Leu Val Glu Arg Leu Ser Asp Ala ArgArg 1780 1785 1790 Asn Gly His Arg Val Leu Ala Val Val Arg Gly Ser AlaVal Asn Gln 1795 1800 1805 Asp Gly Ala Ser Asn Gly Leu Thr Ala Pro AsnGly Pro Ser Gln Gln 1810 1815 1820 Arg Val Ile Arg Arg Ala Leu Ala AspAla Arg Leu Thr Thr Ala Asp 1825 1830 1835 1840 Val Asp Val Val Glu AlaHis Gly Thr Gly Thr Arg Leu Gly Asp Pro 1845 1850 1855 Ile Glu Ala GlnAla Leu Ile Ala Thr Tyr Gly Gln Gly Arg Asp Thr 1860 1865 1870 Glu GlnPro Leu Arg Leu Gly Ser Leu Lys Ser Asn Ile Gly His Thr 1875 1880 1885Gln Ala Ala Ala Gly Val Ser Gly Ile Ile Lys Met Val Gln Ala Met 18901895 1900 Arg His Gly Val Leu Pro Lys Thr Leu His Val Asp Arg Pro SerAsp 1905 1910 1915 1920 Gln Ile Asp Trp Ser Ala Gly Thr Val Glu Leu LeuThr Glu Ala Met 1925 1930 1935 Asp Trp Pro Arg Lys Gln Glu Gly Gly LeuArg Arg Ala Ala Val Ser 1940 1945 1950 Ser Phe Gly Ile Ser Gly Thr AsnAla His Ile Val Leu Glu Glu Ala 1955 1960 1965 Pro Val Asp Glu Asp AlaPro Ala Asp Glu Pro Ser Val Gly Gly Val 1970 1975 1980 Val Pro Trp LeuVal Ser Ala Lys Thr Pro Ala Ala Leu Asp Ala Gln 1985 1990 1995 2000 IleGly Arg Leu Ala Ala Phe Ala Ser Gln Gly Arg Thr Asp Ala Ala 2005 20102015 Asp Pro Gly Ala Val Ala Arg Val Leu Ala Gly Gly Arg Ala Gln Phe2020 2025 2030 Glu His Arg Ala Val Ala Leu Gly Thr Gly Gln Asp Asp LeuAla Ala 2035 2040 2045 Ala Leu Ala Ala Pro Glu Gly Leu Val Arg Gly ValAla Ser Gly Val 2050 2055 2060 Gly Arg Val Ala Phe Val Phe Pro Gly GlnGly Thr Gln Trp Ala Gly 2065 2070 2075 2080 Met Gly Ala Glu Leu Leu AspVal Ser Lys Glu Phe Ala Ala Ala Met 2085 2090 2095 Ala Glu Cys Glu AlaAla Leu Ala Pro Tyr Val Asp Trp Ser Leu Glu 2100 2105 2110 Ala Val ValArg Gln Ala Pro Gly Ala Pro Thr Leu Glu Arg Val Asp 2115 2120 2125 ValVal Gln Pro Val Thr Phe Ala Val Met Val Ser Leu Ala Lys Val 2130 21352140 Trp Gln His His Gly Val Thr Pro Gln Ala Val Val Gly His Ser Gln2145 2150 2155 2160 Gly Glu Ile Ala Ala Ala Tyr Val Ala Gly Ala Leu SerLeu Asp Asp 2165 2170 2175 Ala Ala 
Arg Val Val Thr Leu Arg Ser Lys SerIle Gly Ala His Leu 2180 2185 2190 Ala Gly Gln Gly Gly Met Leu Ser LeuAla Leu Ser Glu Ala Ala Val 2195 2200 2205 Val Glu Arg Leu Ala Gly PheAsp Gly Leu Ser Val Ala Ala Val Asn 2210 2215 2220 Gly Pro Thr Ala ThrVal Val Ser Gly Asp Pro Thr Gln Ile Gln Glu 2225 2230 2235 2240 Leu AlaGln Ala Cys Glu Ala Asp Gly Val Arg Ala Arg Ile Ile Pro 2245 2250 2255Val Asp Tyr Ala Ser His Ser Ala His Val Glu Thr Ile Glu Ser Glu 22602265 2270 Leu Ala Asp Val Leu Ala Gly Leu Ser Pro Gln Thr Pro Gln ValPro 2275 2280 2285 Phe Phe Ser Thr Leu Glu Gly Ala Trp Ile Thr Glu ProAla Leu Asp 2290 2295 2300 Gly Gly Tyr Trp Tyr Arg Asn Leu Arg His ArgVal Gly Phe Ala Pro 2305 2310 2315 2320 Ala Val Glu Thr Leu Ala Thr AspGlu Gly Phe Thr His Phe Val Glu 2325 2330 2335 Val Ser Ala His Pro ValLeu Thr Met Ala Leu Pro Glu Thr Val Thr 2340 2345 2350 Gly Leu Gly ThrLeu Arg Arg Asp Asn Gly Gly Gln His Arg Leu Thr 2355 2360 2365 Thr SerLeu Ala Glu Ala Trp Ala Asn Gly Leu Thr Val Asp Trp Ala 2370 2375 2380Ser Leu Leu Pro Thr Thr Thr Thr His Pro Asp Leu Pro Thr Tyr Ala 23852390 2395 2400 Phe Gln Thr Glu Arg Tyr Trp Pro Gln Pro Asp Leu Ser AlaAla Gly 2405 2410 2415 Asp Ile Thr Ser Ala Gly Leu Gly Ala Ala Glu HisPro Leu Leu Gly 2420 2425 2430 Ala Ala Val Ala Leu Ala Asp Ser Asp GlyCys Leu Leu Thr Gly Ser 2435 2440 2445 Leu Ser Leu Arg Thr His Pro TrpLeu Ala Asp His Ala Val Ala Gly 2450 2455 2460 Thr Val Leu Leu Pro GlyThr Ala Phe Val Glu Leu Ala Phe Arg Ala 2465 2470 2475 2480 Gly Asp GlnVal Gly Cys Asp Leu Val Glu Glu Leu Thr Leu Asp Ala 2485 2490 2495 ProLeu Val Leu Pro Arg Arg Gly Ala Val Arg Val Gln Leu Ser Val 2500 25052510 Gly Ala Ser Asp Glu Ser Gly Arg Arg Thr Phe Gly Leu Tyr Ala His2515 2520 2525 Pro Glu Asp Ala Pro Gly Glu Ala Glu Trp Thr Arg His AlaThr Gly 2530 2535 2540 Val Leu Ala Ala Arg Ala Asp Arg Thr Ala Pro ValAla Asp Pro Glu 2545 2550 2555 2560 Ala Trp Pro Pro Pro Gly Ala Glu ProVal Asp Val Asp Gly Leu Tyr 2565 2570 2575 Glu Arg Phe Ala Ala Asn GlyTyr Gly 
Tyr Gly Pro Leu Phe Gln Gly 2580 2585 2590 Val Arg Gly Val TrpArg Arg Gly Asp Glu Val Phe Ala Asp Val Ala 2595 2600 2605 Leu Pro AlaGlu Val Ala Gly Ala Glu Gly Ala Arg Phe Gly Leu His 2610 2615 2620 ProAla Leu Leu Asp Ala Ala Val Gln Ala Ala Gly Ala Gly Arg Gly 2625 26302635 2640 Val Arg Arg Gly His Ala Ala Ala Val Arg Leu Glu Arg Asp LeuLeu 2645 2650 2655 Tyr Ala Val Gly Ala Thr Ala Leu Arg Val Arg Leu AlaPro Ala Gly 2660 2665 2670 Pro Asp Thr Val Ser Val Ser Ala Ala Asp SerSer Gly Gln Pro Val 2675 2680 2685 Phe Ala Ala Asp Ser Leu Thr Val LeuPro Val Asp Pro Ala Gln Leu 2690 2695 2700 Ala Ala Phe Ser Asp Pro ThrLeu Asp Ala Leu His Leu Leu Glu Trp 2705 2710 2715 2720 Thr Ala Trp AspGly Ala Ala Gln Ala Leu Pro Gly Ala Val Val Leu 2725 2730 2735 Gly GlyAsp Ala Asp Gly Leu Ala Ala Ala Leu Arg Ala Gly Gly Thr 2740 2745 2750Glu Val Leu Ser Phe Pro Asp Leu Thr Asp Leu Val Glu Ala Val Asp 27552760 2765 Arg Gly Glu Thr Pro Ala Pro Ala Thr Val Leu Val Ala Cys ProAla 2770 2775 2780 Ala Gly Pro Asp Gly Pro Glu His Val Arg Glu Ala LeuHis Gly Ser 2785 2790 2795 2800 Leu Ala Leu Met Gln Ala Trp Leu Ala AspGlu Arg Phe Thr Asp Gly 2805 2810 2815 Arg Leu Val Leu Val Thr Arg AspAla Val Ala Ala Arg Ser Gly Asp 2820 2825 2830 Gly Leu Arg Ser Thr GlyGln Ala Ala Val Trp Gly Leu Gly Arg Ser 2835 2840 2845 Ala Gln Thr GluSer Pro Gly Arg Phe Val Leu Leu Asp Leu Ala Gly 2850 2855 2860 Glu AlaArg Thr Ala Gly Asp Ala Thr Ala Gly Asp Gly Leu Thr Thr 2865 2870 28752880 Gly Asp Ala Thr Val Gly Gly Thr Ser Gly Asp Ala Ala Leu Gly Ser2885 2890 2895 Ala Leu Ala Thr Ala Leu Gly Ser Gly Glu Pro Gln Leu AlaLeu Arg 2900 2905 2910 Asp Gly Ala Leu Leu Val Pro Arg Leu Ala Arg AlaAla Ala Pro Ala 2915 2920 2925 Ala Ala Asp Gly Leu Ala Ala Ala Asp GlyLeu Ala Ala Leu Pro Leu 2930 2935 2940 Pro Ala Ala Pro Ala Leu Trp ArgLeu Glu Pro Gly Thr Asp Gly Ser 2945 2950 2955 2960 Leu Glu Ser Leu ThrAla Ala Pro Gly Asp Ala Glu Thr Leu Ala Pro 2965 2970 2975 Glu Pro LeuGly Pro Gly Gln Val Arg Ile Ala Ile Arg Ala Thr Gly 
2980 2985 2990 LeuAsn Phe Arg Asp Val Leu Ile Ala Leu Gly Met Tyr Pro Asp Pro 2995 30003005 Ala Leu Met Gly Thr Glu Gly Ala Gly Val Val Thr Ala Thr Gly Pro3010 3015 3020 Gly Val Thr His Leu Ala Pro Gly Asp Arg Val Met Gly LeuLeu Ser 3025 3030 3035 3040 Gly Ala Tyr Ala Pro Val Val Val Ala Asp AlaArg Thr Val Ala Arg 3045 3050 3055 Met Pro Glu Gly Trp Thr Phe Ala GlnGly Ala Ser Val Pro Val Val 3060 3065 3070 Phe Leu Thr Ala Val Tyr AlaLeu Arg Asp Leu Ala Asp Val Lys Pro 3075 3080 3085 Gly Glu Arg Leu LeuVal His Ser Ala Ala Gly Gly Val Gly Met Ala 3090 3095 3100 Ala Val GlnLeu Ala Arg His Trp Gly Val Glu Val His Gly Thr Ala 3105 3110 3115 3120Ser His Gly Lys Trp Asp Ala Leu Arg Ala Leu Gly Leu Asp Asp Ala 31253130 3135 His Ile Ala Ser Ser Arg Thr Leu Asp Phe Glu Ser Ala Phe ArgAla 3140 3145 3150 Ala Ser Gly Gly Ala Gly Met Asp Val Val Leu Asn SerLeu Ala Arg 3155 3160 3165 Glu Phe Val Asp Ala Ser Leu Arg Leu Leu GlyPro Gly Gly Arg Phe 3170 3175 3180 Val Glu Met Gly Lys Thr Asp Val ArgAsp Ala Glu Arg Val Ala Ala 3185 3190 3195 3200 Asp His Pro Gly Val GlyTyr Arg Ala Phe Asp Leu Gly Glu Ala Gly 3205 3210 3215 Pro Glu Arg IleGly Glu Met Leu Ala Glu Val Ile Ala Leu Phe Glu 3220 3225 3230 Asp GlyVal Leu Arg His Leu Pro Val Thr Thr Trp Asp Val Arg Arg 3235 3240 3245Ala Arg Asp Ala Phe Arg His Val Ser Gln Ala Arg His Thr Gly Lys 32503255 3260 Val Val Leu Thr Met Pro Ser Gly Leu Asp Pro Glu Gly Thr ValLeu 3265 3270 3275 3280 Leu Thr Gly Gly Thr Gly Ala Leu Gly Gly Ile ValAla Arg His Val 3285 3290 3295 Val Gly Glu Trp Gly Val Arg Arg Leu LeuLeu Val Ser Arg Arg Gly 3300 3305 3310 Thr Asp Ala Pro Gly Ala Gly GluLeu Val His Glu Leu Glu Ala Leu 3315 3320 3325 Gly Ala Asp Val Ser ValAla Ala Cys Asp Val Ala Asp Arg Glu Ala 3330 3335 3340 Leu Thr Ala ValLeu Asp Ser Ile Pro Ala Glu His Pro Leu Thr Ala 3345 3350 3355 3360 ValVal His Thr Ala Gly Val Leu Ser Asp Gly Thr Leu Pro Ser Met 3365 33703375 Thr Ala Glu Asp Val Glu His Val Leu Arg Pro Lys Val Asp Ala Ala3380 3385 3390 Phe Leu Leu 
Asp Glu Leu Thr Ser Thr Pro Gly Tyr Asp LeuAla Ala 3395 3400 3405 Phe Val Met Phe Ser Ser Ala Ala Ala Val Phe GlyGly Ala Gly Gln 3410 3415 3420 Gly Ala Tyr Ala Ala Ala Asn Ala Thr LeuAsp Ala Leu Ala Trp Arg 3425 3430 3435 3440 Arg Arg Thr Ala Gly Leu ProAla Leu Ser Leu Gly Trp Gly Leu Trp 3445 3450 3455 Ala Glu Thr Ser GlyMet Thr Gly Gly Leu Ser Asp Thr Asp Arg Ser 3460 3465 3470 Arg Leu AlaArg Ser Gly Ala Thr Pro Met Asp Ser Glu Leu Thr Leu 3475 3480 3485 SerLeu Leu Asp Ala Ala Met Arg Arg Asp Asp Pro Ala Leu Val Pro 3490 34953500 Ile Ala Leu Asp Val Ala Ala Leu Arg Ala Gln Gln Arg Asp Gly Met3505 3510 3515 3520 Leu Ala Pro Leu Leu Ser Gly Leu Thr Arg Gly Ser ArgVal Gly Gly 3525 3530 3535 Ala Pro Val Asn Gln Arg Arg Ala Ala Ala GlyGly Ala Gly Glu Ala 3540 3545 3550 Asp Thr Asp Leu Gly Gly Arg Leu AlaAla Met Thr Pro Asp Asp Arg 3555 3560 3565 Val Ala His Leu Arg Asp LeuVal Arg Thr His Val Ala Thr Val Leu 3570 3575 3580 Gly His Gly Thr ProSer Arg Val Asp Leu Glu Arg Ala Phe Arg Asp 3585 3590 3595 3600 Thr GlyPhe Asp Ser Leu Thr Ala Val Glu Leu Arg Asn Arg Leu Asn 3605 3610 3615Ala Ala Thr Gly Leu Arg Leu Pro Ala Thr Leu Val Phe Asp His Pro 36203625 3630 Thr Pro Gly Glu Leu Ala Gly His Leu Leu Asp Glu Leu Ala ThrAla 3635 3640 3645 Ala Gly Gly Ser Trp Ala Glu Gly Thr Gly Ser Gly AspThr Ala Ser 3650 3655 3660 Ala Thr Asp Arg Gln Thr Thr Ala Ala Leu AlaGlu Leu Asp Arg Leu 3665 3670 3675 3680 Glu Gly Val Leu Ala Ser Leu AlaPro Ala Ala Gly Gly Arg Pro Glu 3685 3690 3695 Leu Ala Ala Arg Leu ArgAla Leu Ala Ala Ala Leu Gly Asp Asp Gly 3700 3705 3710 Asp Asp Ala ThrAsp Leu Asp Glu Ala Ser Asp Asp Asp Leu Phe Ser 3715 3720 3725 Phe IleAsp Lys Glu Leu Gly Asp Ser Asp Phe 3730 3735 34 4689 DNA Streptomycesvenezuelae 34 atggcgaaca acgaagacaa gctccgcgac tacctcaagc gcgtcaccgccgagctgcag 60 cagaacacca ggcgtctgcg cgagatcgag ggacgcacgc acgagccggtggcgatcgtg 120 ggcatggcct gccgcctgcc gggcggtgtc gcctcgcccg aggacctgtggcagctggtg 180 gccggggacg gggacgcgat ctcggagttc ccgcaggacc 
gcggctgggacgtggagggg 240 ctgtacgacc ccgacccgga cgcgtccggc aggacgtact gccggtccggcggattcctg 300 cacgacgccg gcgagttcga cgccgacttc ttcgggatct cgccgcgcgaggccctcgcc 360 atggacccgc agcagcgact gtccctcacc accgcgtggg aggcgatcgagagcgcgggc 420 atcgacccga cggccctgaa gggcagcggc ctcggcgtct tcgtcggcggctggcacacc 480 ggctacacct cggggcagac caccgccgtg cagtcgcccg agctggagggccacctggtc 540 agcggcgcgg cgctgggctt cctgtccggc cgtatcgcgt acgtcctcggtacggacgga 600 ccggccctga ccgtggacac ggcctgctcg tcctcgctgg tcgccctgcacctcgccgtg 660 caggccctcc gcaagggcga gtgcgacatg gccctcgccg gtggtgtcacggtcatgccc 720 aacgcggacc tgttcgtgca gttcagccgg cagcgcgggc tggccgcggacggccggtcg 780 aaggcgttcg ccacctcggc ggacggcttc ggccccgcgg agggcgccggagtcctgctg 840 gtggagcgcc tgtcggacgc ccgccgcaac ggacaccgga tcctcgcggtcgtccgcggc 900 agcgcggtca accaggacgg cgccagcaac ggcctcacgg ctccgcacgggccctcccag 960 cagcgcgtca tccgacgggc cctggcggac gcccggctcg cgccgggtgacgtggacgtc 1020 gtcgaggcgc acggcacggg cacgcggctc ggcgacccga tcgaggcgcaggccctcatc 1080 gccacctacg gccaggagaa gagcagcgaa cagccgctga ggctgggcgcgttgaagtcg 1140 aacatcgggc acacgcaggc cgcggccggt gtcgcaggtg tcatcaagatggtccaggcg 1200 atgcgccacg gactgctgcc gaagacgctg cacgtcgacg agccctcggaccagatcgac 1260 tggtcggcgg gcacggtgga actcctcacc gaggccgtcg actggccggagaagcaggac 1320 ggcgggctgc gccgcgcggc tgtctcctcc ttcggcatca gcgggacgaacgcgcacgtc 1380 gtcctggagg aggccccggc ggtcgaggac tccccggccg tcgagccgccggccggtggc 1440 ggtgtggtgc cgtggccggt gtccgcgaag actccggccg cgctggacgcccagatcggg 1500 cagctcgccg cgtacgcgga cggtcgtacg gacgtggatc cggcggtggccgcccgcgcc 1560 ctggtcgaca gccgtacggc gatggagcac cgcgcggtcg cggtcggcgacagccgggag 1620 gcactgcggg acgccctgcg gatgccggaa ggactggtac gcggcacgtcctcggacgtg 1680 ggccgggtgg cgttcgtctt ccccggccag ggcacgcagt gggccggcatgggcgccgaa 1740 ctccttgaca gctcaccgga gttcgctgcc tcgatggccg aatgcgagaccgcgctctcc 1800 cgctacgtcg actggtctct tgaagccgtc gtccgacagg aacccggcgcacccacgctc 1860 gaccgcgtcg acgtcgtcca gcccgtgacc ttcgctgtca tggtctcgctggcgaaggtc 1920 tggcagcacc acggcatcac 
cccccaggcc gtcgtcggcc actcgcagggcgagatcgcc 1980 gccgcgtacg tcgccggtgc actcaccctc gacgacgccg cccgcgtcgtcaccctgcgc 2040 agcaagtcca tcgccgccca cctcgccggc aagggcggca tgatctccctcgccctcgac 2100 gaggcggccg tcctgaagcg actgagcgac ttcgacggac tctccgtcgccgccgtcaac 2160 ggccccaccg ccaccgtcgt ctccggcgac ccgacccaga tcgaggaactcgcccgcacc 2220 tgcgaggccg acggcgtccg tgcgcggatc atcccggtcg actacgcctcccacagccgg 2280 caggtcgaga tcatcgagaa ggagctggcc gaggtcctcg ccggactcgccccgcaggct 2340 ccgcacgtgc cgttcttctc caccctcgaa ggcacctgga tcaccgagccggtgctcgac 2400 ggcacctact ggtaccgcaa cctgcgccat cgcgtgggct tcgcccccgccgtggagacc 2460 ttggcggttg acggcttcac ccacttcatc gaggtcagcg cccaccccgtcctcaccatg 2520 accctccccg agaccgtcac cggcctcggc accctccgcc gcgaacagggaggccaggag 2580 cgtctggtca cctcactcgc cgaagcctgg gccaacggcc tcaccatcgactgggcgccc 2640 atcctcccca ccgcaaccgg ccaccacccc gagctcccca cctacgccttccagaccgag 2700 cgcttctggc tgcagagctc cgcgcccacc agcgccgccg acgactggcgttaccgcgtc 2760 gagtggaagc cgctgacggc ctccggccag gcggacctgt ccgggcggtggatcgtcgcc 2820 gtcgggagcg agccagaagc cgagctgctg ggcgcgctga aggccgcgggagcggaggtc 2880 gacgtactgg aagccggggc ggacgacgac cgtgaggccc tcgccgcccggctcaccgca 2940 ctgacgaccg gcgacggctt caccggcgtg gtctcgctcc tcgacgacctcgtgccacag 3000 gtcgcctggg tgcaggcact cggcgacgcc ggaatcaagg cgcccctgtggtccgtcacc 3060 cagggcgcgg tctccgtcgg acgtctcgac acccccgccg accccgaccgggccatgctc 3120 tggggcctcg gccgcgtcgt cgcccttgag caccccgaac gctgggccggcctcgtcgac 3180 ctccccgccc agcccgatgc cgccgccctc gcccacctcg tcaccgcactctccggcgcc 3240 accggcgagg accagatcgc catccgcacc accggactcc acgcccgccgcctcgcccgc 3300 gcacccctcc acggacgtcg gcccacccgc gactggcagc cccacggcaccgtcctcatc 3360 accggcggca ccggagccct cggcagccac gccgcacgct ggatggcccaccacggagcc 3420 gaacacctcc tcctcgtcag ccgcagcggc gaacaagccc ccggagccacccaactcacc 3480 gccgaactca ccgcatcggg cgcccgcgtc accatcgccg cctgcgacgtcgccgacccc 3540 cacgccatgc gcaccctcct cgacgccatc cccgccgaga cgcccctcaccgccgtcgtc 3600 cacaccgccg gcgcaccggg cggcgatccg ctggacgtca 
ccggcccggaggacatcgcc 3660 cgcatcctgg gcgcgaagac gagcggcgcc gaggtcctcg acgacctgctccgcggcact 3720 ccgctggacg ccttcgtcct ctactcctcg aacgccgggg tctggggcagcggcagccag 3780 ggcgtctacg cggcggccaa cgcccacctc gacgcgctcg ccgcccggcgccgcgcccgg 3840 ggcgagacgg cgacctcggt cgcctggggc ctctgggccg gcgacggcatgggccggggc 3900 gccgacgacg cgtactggca gcgtcgcggc atccgtccga tgagccccgaccgcgccctg 3960 gacgaactgg ccaaggccct gagccacgac gagaccttcg tcgccgtggccgatgtcgac 4020 tgggagcggt tcgcgcccgc gttcacggtg tcccgtccca gccttctgctcgacggcgtc 4080 ccggaggccc ggcaggcgct cgccgcaccc gtcggtgccc cggctcccggcgacgccgcc 4140 gtggcgccga ccgggcagtc gtcggcgctg gccgcgatca ccgcgctccccgagcccgag 4200 cgccggccgg cgctcctcac cctcgtccgt acccacgcgg cggccgtactcggccattcc 4260 tcccccgacc gggtggcccc cggccgtgcc ttcaccgagc tcggcttcgactcgctgacg 4320 gccgtgcagc tccgcaacca gctctccacg gtggtcggca acaggctccccgccaccacg 4380 gtcttcgacc acccgacgcc cgccgcactc gccgcgcacc tccacgaggcgtacctcgca 4440 ccggccgagc cggccccgac ggactgggag gggcgggtgc gccgggccctggccgaactg 4500 cccctcgacc ggctgcggga cgcgggggtc ctcgacaccg tcctgcgcctcaccggcatc 4560 gagcccgagc cgggttccgg cggttcggac ggcggcgccg ccgaccctggtgcggagccg 4620 gaggcgtcga tcgacgacct ggacgccgag gccctgatcc ggatggctctcggcccccgt 4680 aacacctga 4689 35 1562 PRT Streptomyces venezuelae 35Met Ala Asn Asn Glu Asp Lys Leu Arg Asp Tyr Leu Lys Arg Val Thr 1 5 1015 Ala Glu Leu Gln Gln Asn Thr Arg Arg Leu Arg Glu Ile Glu Gly Arg 20 2530 Thr His Glu Pro Val Ala Ile Val Gly Met Ala Cys Arg Leu Pro Gly 35 4045 Gly Val Ala Ser Pro Glu Asp Leu Trp Gln Leu Val Ala Gly Asp Gly 50 5560 Asp Ala Ile Ser Glu Phe Pro Gln Asp Arg Gly Trp Asp Val Glu Gly 65 7075 80 Leu Tyr Asp Pro Asp Pro Asp Ala Ser Gly Arg Thr Tyr Cys Arg Ser 8590 95 Gly Gly Phe Leu His Asp Ala Gly Glu Phe Asp Ala Asp Phe Phe Gly100 105 110 Ile Ser Pro Arg Glu Ala Leu Ala Met Asp Pro Gln Gln Arg LeuSer 115 120 125 Leu Thr Thr Ala Trp Glu Ala Ile Glu Ser Ala Gly Ile AspPro Thr 130 135 140 Ala Leu Lys Gly Ser Gly Leu Gly Val Phe Val Gly GlyTrp His Thr 145 
150 155 160 Gly Tyr Thr Ser Gly Gln Thr Thr Ala Val GlnSer Pro Glu Leu Glu 165 170 175 Gly His Leu Val Ser Gly Ala Ala Leu GlyPhe Leu Ser Gly Arg Ile 180 185 190 Ala Tyr Val Leu Gly Thr Asp Gly ProAla Leu Thr Val Asp Thr Ala 195 200 205 Cys Ser Ser Ser Leu Val Ala LeuHis Leu Ala Val Gln Ala Leu Arg 210 215 220 Lys Gly Glu Cys Asp Met AlaLeu Ala Gly Gly Val Thr Val Met Pro 225 230 235 240 Asn Ala Asp Leu PheVal Gln Phe Ser Arg Gln Arg Gly Leu Ala Ala 245 250 255 Asp Gly Arg SerLys Ala Phe Ala Thr Ser Ala Asp Gly Phe Gly Pro 260 265 270 Ala Glu GlyAla Gly Val Leu Leu Val Glu Arg Leu Ser Asp Ala Arg 275 280 285 Arg AsnGly His Arg Ile Leu Ala Val Val Arg Gly Ser Ala Val Asn 290 295 300 GlnAsp Gly Ala Ser Asn Gly Leu Thr Ala Pro His Gly Pro Ser Gln 305 310 315320 Gln Arg Val Ile Arg Arg Ala Leu Ala Asp Ala Arg Leu Ala Pro Gly 325330 335 Asp Val Asp Val Val Glu Ala His Gly Thr Gly Thr Arg Leu Gly Asp340 345 350 Pro Ile Glu Ala Gln Ala Leu Ile Ala Thr Tyr Gly Gln Glu LysSer 355 360 365 Ser Glu Gln Pro Leu Arg Leu Gly Ala Leu Lys Ser Asn IleGly His 370 375 380 Thr Gln Ala Ala Ala Gly Val Ala Gly Val Ile Lys MetVal Gln Ala 385 390 395 400 Met Arg His Gly Leu Leu Pro Lys Thr Leu HisVal Asp Glu Pro Ser 405 410 415 Asp Gln Ile Asp Trp Ser Ala Gly Thr ValGlu Leu Leu Thr Glu Ala 420 425 430 Val Asp Trp Pro Glu Lys Gln Asp GlyGly Leu Arg Arg Ala Ala Val 435 440 445 Ser Ser Phe Gly Ile Ser Gly ThrAsn Ala His Val Val Leu Glu Glu 450 455 460 Ala Pro Ala Val Glu Asp SerPro Ala Val Glu Pro Pro Ala Gly Gly 465 470 475 480 Gly Val Val Pro TrpPro Val Ser Ala Lys Thr Pro Ala Ala Leu Asp 485 490 495 Ala Gln Ile GlyGln Leu Ala Ala Tyr Ala Asp Gly Arg Thr Asp Val 500 505 510 Asp Pro AlaVal Ala Ala Arg Ala Leu Val Asp Ser Arg Thr Ala Met 515 520 525 Glu HisArg Ala Val Ala Val Gly Asp Ser Arg Glu Ala Leu Arg Asp 530 535 540 AlaLeu Arg Met Pro Glu Gly Leu Val Arg Gly Thr Ser Ser Asp Val 545 550 555560 Gly Arg Val Ala Phe Val Phe Pro Gly Gln Gly Thr Gln Trp Ala Gly 565570 575 Met Gly Ala Glu Leu 
Leu Asp Ser Ser Pro Glu Phe Ala Ala Ser Met580 585 590 Ala Glu Cys Glu Thr Ala Leu Ser Arg Tyr Val Asp Trp Ser LeuGlu 595 600 605 Ala Val Val Arg Gln Glu Pro Gly Ala Pro Thr Leu Asp ArgVal Asp 610 615 620 Val Val Gln Pro Val Thr Phe Ala Val Met Val Ser LeuAla Lys Val 625 630 635 640 Trp Gln His His Gly Ile Thr Pro Gln Ala ValVal Gly His Ser Gln 645 650 655 Gly Glu Ile Ala Ala Ala Tyr Val Ala GlyAla Leu Thr Leu Asp Asp 660 665 670 Ala Ala Arg Val Val Thr Leu Arg SerLys Ser Ile Ala Ala His Leu 675 680 685 Ala Gly Lys Gly Gly Met Ile SerLeu Ala Leu Asp Glu Ala Ala Val 690 695 700 Leu Lys Arg Leu Ser Asp PheAsp Gly Leu Ser Val Ala Ala Val Asn 705 710 715 720 Gly Pro Thr Ala ThrVal Val Ser Gly Asp Pro Thr Gln Ile Glu Glu 725 730 735 Leu Ala Arg ThrCys Glu Ala Asp Gly Val Arg Ala Arg Ile Ile Pro 740 745 750 Val Asp TyrAla Ser His Ser Arg Gln Val Glu Ile Ile Glu Lys Glu 755 760 765 Leu AlaGlu Val Leu Ala Gly Leu Ala Pro Gln Ala Pro His Val Pro 770 775 780 PhePhe Ser Thr Leu Glu Gly Thr Trp Ile Thr Glu Pro Val Leu Asp 785 790 795800 Gly Thr Tyr Trp Tyr Arg Asn Leu Arg His Arg Val Gly Phe Ala Pro 805810 815 Ala Val Glu Thr Leu Ala Val Asp Gly Phe Thr His Phe Ile Glu Val820 825 830 Ser Ala His Pro Val Leu Thr Met Thr Leu Pro Glu Thr Val ThrGly 835 840 845 Leu Gly Thr Leu Arg Arg Glu Gln Gly Gly Gln Glu Arg LeuVal Thr 850 855 860 Ser Leu Ala Glu Ala Trp Ala Asn Gly Leu Thr Ile AspTrp Ala Pro 865 870 875 880 Ile Leu Pro Thr Ala Thr Gly His His Pro GluLeu Pro Thr Tyr Ala 885 890 895 Phe Gln Thr Glu Arg Phe Trp Leu Gln SerSer Ala Pro Thr Ser Ala 900 905 910 Ala Asp Asp Trp Arg Tyr Arg Val GluTrp Lys Pro Leu Thr Ala Ser 915 920 925 Gly Gln Ala Asp Leu Ser Gly ArgTrp Ile Val Ala Val Gly Ser Glu 930 935 940 Pro Glu Ala Glu Leu Leu GlyAla Leu Lys Ala Ala Gly Ala Glu Val 945 950 955 960 Asp Val Leu Glu AlaGly Ala Asp Asp Asp Arg Glu Ala Leu Ala Ala 965 970 975 Arg Leu Thr AlaLeu Thr Thr Gly Asp Gly Phe Thr Gly Val Val Ser 980 985 990 Leu Leu AspAsp Leu Val Pro Gln Val Ala Trp Val Gln 
Ala Leu Gly 995 1000 1005 AspAla Gly Ile Lys Ala Pro Leu Trp Ser Val Thr Gln Gly Ala Val 1010 10151020 Ser Val Gly Arg Leu Asp Thr Pro Ala Asp Pro Asp Arg Ala Met Leu1025 1030 1035 1040 Trp Gly Leu Gly Arg Val Val Ala Leu Glu His Pro GluArg Trp Ala 1045 1050 1055 Gly Leu Val Asp Leu Pro Ala Gln Pro Asp AlaAla Ala Leu Ala His 1060 1065 1070 Leu Val Thr Ala Leu Ser Gly Ala ThrGly Glu Asp Gln Ile Ala Ile 1075 1080 1085 Arg Thr Thr Gly Leu His AlaArg Arg Leu Ala Arg Ala Pro Leu His 1090 1095 1100 Gly Arg Arg Pro ThrArg Asp Trp Gln Pro His Gly Thr Val Leu Ile 1105 1110 1115 1120 Thr GlyGly Thr Gly Ala Leu Gly Ser His Ala Ala Arg Trp Met Ala 1125 1130 1135His His Gly Ala Glu His Leu Leu Leu Val Ser Arg Ser Gly Glu Gln 11401145 1150 Ala Pro Gly Ala Thr Gln Leu Thr Ala Glu Leu Thr Ala Ser GlyAla 1155 1160 1165 Arg Val Thr Ile Ala Ala Cys Asp Val Ala Asp Pro HisAla Met Arg 1170 1175 1180 Thr Leu Leu Asp Ala Ile Pro Ala Glu Thr ProLeu Thr Ala Val Val 1185 1190 1195 1200 His Thr Ala Gly Ala Pro Gly GlyAsp Pro Leu Asp Val Thr Gly Pro 1205 1210 1215 Glu Asp Ile Ala Arg IleLeu Gly Ala Lys Thr Ser Gly Ala Glu Val 1220 1225 1230 Leu Asp Asp LeuLeu Arg Gly Thr Pro Leu Asp Ala Phe Val Leu Tyr 1235 1240 1245 Ser SerAsn Ala Gly Val Trp Gly Ser Gly Ser Gln Gly Val Tyr Ala 1250 1255 1260Ala Ala Asn Ala His Leu Asp Ala Leu Ala Ala Arg Arg Arg Ala Arg 12651270 1275 1280 Gly Glu Thr Ala Thr Ser Val Ala Trp Gly Leu Trp Ala GlyAsp Gly 1285 1290 1295 Met Gly Arg Gly Ala Asp Asp Ala Tyr Trp Gln ArgArg Gly Ile Arg 1300 1305 1310 Pro Met Ser Pro Asp Arg Ala Leu Asp GluLeu Ala Lys Ala Leu Ser 1315 1320 1325 His Asp Glu Thr Phe Val Ala ValAla Asp Val Asp Trp Glu Arg Phe 1330 1335 1340 Ala Pro Ala Phe Thr ValSer Arg Pro Ser Leu Leu Leu Asp Gly Val 1345 1350 1355 1360 Pro Glu AlaArg Gln Ala Leu Ala Ala Pro Val Gly Ala Pro Ala Pro 1365 1370 1375 GlyAsp Ala Ala Val Ala Pro Thr Gly Gln Ser Ser Ala Leu Ala Ala 1380 13851390 Ile Thr Ala Leu Pro Glu Pro Glu Arg Arg Pro Ala Leu Leu Thr Leu1395 1400 1405 
Val Arg Thr His Ala Ala Ala Val Leu Gly His Ser Ser ProAsp Arg 1410 1415 1420 Val Ala Pro Gly Arg Ala Phe Thr Glu Leu Gly PheAsp Ser Leu Thr 1425 1430 1435 1440 Ala Val Gln Leu Arg Asn Gln Leu SerThr Val Val Gly Asn Arg Leu 1445 1450 1455 Pro Ala Thr Thr Val Phe AspHis Pro Thr Pro Ala Ala Leu Ala Ala 1460 1465 1470 His Leu His Glu AlaTyr Leu Ala Pro Ala Glu Pro Ala Pro Thr Asp 1475 1480 1485 Trp Glu GlyArg Val Arg Arg Ala Leu Ala Glu Leu Pro Leu Asp Arg 1490 1495 1500 LeuArg Asp Ala Gly Val Leu Asp Thr Val Leu Arg Leu Thr Gly Ile 1505 15101515 1520 Glu Pro Glu Pro Gly Ser Gly Gly Ser Asp Gly Gly Ala Ala AspPro 1525 1530 1535 Gly Ala Glu Pro Glu Ala Ser Ile Asp Asp Leu Asp AlaGlu Ala Leu 1540 1545 1550 Ile Arg Met Ala Leu Gly Pro Arg Asn Thr 15551560 36 4041 DNA Streptomyces venezuelae 36 atgacgagtt ccaacgaacagttggtggac gctctgcgcg cctctctcaa ggagaacgaa 60 gaactccgga aagagagccgtcgccgggcc gaccgtcggc aggagcccat ggcgatcgtc 120 ggcatgagct gccggttcgcgggcggaatc cggtcccccg aggacctctg ggacgccgtc 180 gccgcgggca aggacctggtctccgaggta ccggaggagc gcggctggga catcgactcc 240 ctctacgacc cggtgcccgggcgcaagggc acgacgtacg tccgcaacgc cgcgttcctc 300 gacgacgccg ccggattcgacgcggccttc ttcgggatct cgccgcgcga ggccctcgcc 360 atggacccgc agcagcggcagctcctcgaa gcctcctggg aggtcttcga gcgggccggc 420 atcgaccccg cgtcggtccgcggcaccgac gtcggcgtgt acgtgggctg tggctaccag 480 gactacgcgc cggacatccgggtcgccccc gaaggcaccg gcggttacgt cgtcaccggc 540 aactcctccg ccgtggcctccgggcgcatc gcgtactccc tcggcctgga gggacccgcc 600 gtgaccgtgg acacggcgtgctcctcttcg ctcgtcgccc tgcacctcgc cctgaagggc 660 ctgcggaacg gcgactgctcgacggcactc gtgggcggcg tggccgtcct cgcgacgccg 720 ggcgcgttca tcgagttcagcagccagcag gccatggccg ccgacggccg gaccaagggc 780 ttcgcctcgg cggcggacggcctcgcctgg ggcgagggcg tcgccgtact cctcctcgaa 840 cggctctccg acgcgcggcgcaagggccac cgggtcctgg ccgtcgtgcg cggcagcgcc 900 atcaaccagg acggcgcgagcaacggcctc acggctccgc acgggccctc ccagcagcac 960 ctgatccgcc aggccctggccgacgcgcgg ctcacgtcga gcgacgtgga cgtcgtggag 1020 ggccacggca 
cggggacccgtctcggcgac ccgatcgagg cgcaggcgct gctcgccacg 1080 tacgggcagg ggcgcgccccggggcagccg ctgcggctgg ggacgctgaa gtcgaacatc 1140 gggcacacgc aggccgcttcgggtgtcgcc ggtgtcatca agatggtgca ggcgctgcgc 1200 cacggggtgc tgccgaagaccctgcacgtg gacgagccga cggaccaggt cgactggtcg 1260 gccggttcgg tcgagctgctcaccgaggcc gtggactggc cggagcggcc gggccggctc 1320 cgccgggcgg gcgtctccgcgttcggcgtg ggcgggacga acgcgcacgt cgtcctggag 1380 gaggccccgg cggtcgaggagtcccctgcc gtcgagccgc cggccggtgg cggcgtggtg 1440 ccgtggccgg tgtccgcgaagacctcggcc gcactggacg cccagatcgg gcagctcgcc 1500 gcatacgcgg aagaccgcacggacgtggat ccggcggtgg ccgcccgcgc cctggtcgac 1560 agccgtacgg cgatggagcaccgcgcggtc gcggtcggcg acagccggga ggcactgcgg 1620 gacgccctgc ggatgccggaaggactggta cggggcacgg tcaccgatcc gggccgggtg 1680 gcgttcgtct tccccggccagggcacgcag tgggccggca tgggcgccga actcctcgac 1740 agctcacccg aattcgccgccgccatggcc gaatgcgaga ccgcactctc cccgtacgtc 1800 gactggtctc tcgaagccgtcgtccgacag gctcccagcg caccgacact cgaccgcgtc 1860 gacgtcgtcc agcccgtcaccttcgccgtc atggtctccc tcgccaaggt ctggcagcac 1920 cacggcatca cccccgaggccgtcatcggc cactcccagg gcgagatcgc cgccgcgtac 1980 gtcgccggtg ccctcaccctcgacgacgcc gctcgtgtcg tgaccctccg cagcaagtcc 2040 atcgccgccc acctcgccggcaagggcggc atgatctccc tcgccctcag cgaggaagcc 2100 acccggcagc gcatcgagaacctccacgga ctgtcgatcg ccgccgtcaa cgggcctacc 2160 gccaccgtgg tttcgggcgaccccacccag atccaagaac ttgctcaggc gtgtgaggcc 2220 gacggcatcc gcgcacggatcatccccgtc gactacgcct cccacagcgc ccacgtcgag 2280 accatcgaga acgaactcgccgacgtcctg gcggggttgt ccccccagac accccaggtc 2340 cccttcttct ccaccctcgaaggcacctgg atcaccgaac ccgccctcga cggcggctac 2400 tggtaccgca acctccgccatcgtgtgggc ttcgccccgg ccgtcgagac cctcgccacc 2460 gacgaaggct tcacccacttcatcgaggtc agcgcccacc ccgtcctcac catgaccctc 2520 cccgacaagg tcaccggcctggccaccctc cgacgcgagg acggcggaca gcaccgcctc 2580 accacctccc ttgccgaggcctgggccaac ggcctcgccc tcgactgggc ctccctcctg 2640 cccgccacgg gcgccctcagccccgccgtc cccgacctcc cgacgtacgc cttccagcac 2700 cgctcgtact ggatcagccccgcgggtccc ggcgaggcgc 
ccgcgcacac cgcttccggg 2760 cgcgaggccg tcgccgagacggggctcgcg tggggcccgg gtgccgagga cctcgacgag 2820 gagggccggc gcagcgccgtactcgcgatg gtgatgcggc aggcggcctc cgtgctccgg 2880 tgcgactcgc ccgaagaggtccccgtcgac cgcccgctgc gggagatcgg cttcgactcg 2940 ctgaccgccg tcgacttccgcaaccgcgtc aaccggctga ccggtctcca gctgccgccc 3000 accgtcgtgt tccagcacccgacgcccgtc gcgctcgccg agcgcatcag cgacgagctg 3060 gccgagcgga actgggccgtcgccgagccg tcggatcacg agcaggcgga ggaggagaag 3120 gccgccgctc cggcgggggcccgctccggg gccgacaccg gcgccggcgc cgggatgttc 3180 cgcgccctgt tccggcaggccgtggaggac gaccggtacg gcgagttcct cgacgtcctc 3240 gccgaagcct ccgcgttccgcccgcagttc gcctcgcccg aggcctgctc ggagcggctc 3300 gacccggtgc tgctcgccggcggtccgacg gaccgggcgg aaggccgtgc cgttctcgtc 3360 ggctgcaccg gcaccgcggcgaacggcggc ccgcacgagt tcctgcggct cagcacctcc 3420 ttccaggagg agcgggacttcctcgccgta cctctccccg gctacggcac gggtacgggc 3480 accggcacgg ccctcctcccggccgatctc gacaccgcgc tcgacgccca ggcccgggcg 3540 atcctccggg ccgccggggacgccccggtc gtcctgctcg ggcactccgg cggcgccctg 3600 ctcgcgcacg agctggccttccgcctggag cgggcgcacg gcgcgccgcc ggccgggatc 3660 gtcctggtcg acccctatccgccgggccat caggagccca tcgaggtgtg gagcaggcag 3720 ctgggcgagg gcctgttcgcgggcgagctg gagccgatgt ccgatgcgcg gctgctggcc 3780 atgggccggt acgcgcggttcctcgccggc ccgcggccgg gccgcagcag cgcgcccgtg 3840 cttctggtcc gtgcctccgaaccgctgggc gactggcagg aggagcgggg cgactggcgt 3900 gcccactggg accttccgcacaccgtcgcg gacgtgccgg gcgaccactt cacgatgatg 3960 cgggaccacg cgccggccgtcgccgaggcc gtcctctcct ggctcgacgc catcgagggc 4020 atcgaggggg cgggcaagtg a4041 37 1346 PRT Streptomyces venezuelae 37 Met Thr Ser Ser Asn Glu GlnLeu Val Asp Ala Leu Arg Ala Ser Leu 1 5 10 15 Lys Glu Asn Glu Glu LeuArg Lys Glu Ser Arg Arg Arg Ala Asp Arg 20 25 30 Arg Gln Glu Pro Met AlaIle Val Gly Met Ser Cys Arg Phe Ala Gly 35 40 45 Gly Ile Arg Ser Pro GluAsp Leu Trp Asp Ala Val Ala Ala Gly Lys 50 55 60 Asp Leu Val Ser Glu ValPro Glu Glu Arg Gly Trp Asp Ile Asp Ser 65 70 75 80 Leu Tyr Asp Pro ValPro Gly Arg Lys Gly Thr Thr Tyr Val Arg Asn 85 90 95 
Ala Ala Phe Leu AspAsp Ala Ala Gly Phe Asp Ala Ala Phe Phe Gly 100 105 110 Ile Ser Pro ArgGlu Ala Leu Ala Met Asp Pro Gln Gln Arg Gln Leu 115 120 125 Leu Glu AlaSer Trp Glu Val Phe Glu Arg Ala Gly Ile Asp Pro Ala 130 135 140 Ser ValArg Gly Thr Asp Val Gly Val Tyr Val Gly Cys Gly Tyr Gln 145 150 155 160Asp Tyr Ala Pro Asp Ile Arg Val Ala Pro Glu Gly Thr Gly Gly Tyr 165 170175 Val Val Thr Gly Asn Ser Ser Ala Val Ala Ser Gly Arg Ile Ala Tyr 180185 190 Ser Leu Gly Leu Glu Gly Pro Ala Val Thr Val Asp Thr Ala Cys Ser195 200 205 Ser Ser Leu Val Ala Leu His Leu Ala Leu Lys Gly Leu Arg AsnGly 210 215 220 Asp Cys Ser Thr Ala Leu Val Gly Gly Val Ala Val Leu AlaThr Pro 225 230 235 240 Gly Ala Phe Ile Glu Phe Ser Ser Gln Gln Ala MetAla Ala Asp Gly 245 250 255 Arg Thr Lys Gly Phe Ala Ser Ala Ala Asp GlyLeu Ala Trp Gly Glu 260 265 270 Gly Val Ala Val Leu Leu Leu Glu Arg LeuSer Asp Ala Arg Arg Lys 275 280 285 Gly His Arg Val Leu Ala Val Val ArgGly Ser Ala Ile Asn Gln Asp 290 295 300 Gly Ala Ser Asn Gly Leu Thr AlaPro His Gly Pro Ser Gln Gln His 305 310 315 320 Leu Ile Arg Gln Ala LeuAla Asp Ala Arg Leu Thr Ser Ser Asp Val 325 330 335 Asp Val Val Glu GlyHis Gly Thr Gly Thr Arg Leu Gly Asp Pro Ile 340 345 350 Glu Ala Gln AlaLeu Leu Ala Thr Tyr Gly Gln Gly Arg Ala Pro Gly 355 360 365 Gln Pro LeuArg Leu Gly Thr Leu Lys Ser Asn Ile Gly His Thr Gln 370 375 380 Ala AlaSer Gly Val Ala Gly Val Ile Lys Met Val Gln Ala Leu Arg 385 390 395 400His Gly Val Leu Pro Lys Thr Leu His Val Asp Glu Pro Thr Asp Gln 405 410415 Val Asp Trp Ser Ala Gly Ser Val Glu Leu Leu Thr Glu Ala Val Asp 420425 430 Trp Pro Glu Arg Pro Gly Arg Leu Arg Arg Ala Gly Val Ser Ala Phe435 440 445 Gly Val Gly Gly Thr Asn Ala His Val Val Leu Glu Glu Ala ProAla 450 455 460 Val Glu Glu Ser Pro Ala Val Glu Pro Pro Ala Gly Gly GlyVal Val 465 470 475 480 Pro Trp Pro Val Ser Ala Lys Thr Ser Ala Ala LeuAsp Ala Gln Ile 485 490 495 Gly Gln Leu Ala Ala Tyr Ala Glu Asp Arg ThrAsp Val Asp Pro Ala 500 505 510 Val Ala Ala Arg Ala Leu Val Asp 
Ser ArgThr Ala Met Glu His Arg 515 520 525 Ala Val Ala Val Gly Asp Ser Arg GluAla Leu Arg Asp Ala Leu Arg 530 535 540 Met Pro Glu Gly Leu Val Arg GlyThr Val Thr Asp Pro Gly Arg Val 545 550 555 560 Ala Phe Val Phe Pro GlyGln Gly Thr Gln Trp Ala Gly Met Gly Ala 565 570 575 Glu Leu Leu Asp SerSer Pro Glu Phe Ala Ala Ala Met Ala Glu Cys 580 585 590 Glu Thr Ala LeuSer Pro Tyr Val Asp Trp Ser Leu Glu Ala Val Val 595 600 605 Arg Gln AlaPro Ser Ala Pro Thr Leu Asp Arg Val Asp Val Val Gln 610 615 620 Pro ValThr Phe Ala Val Met Val Ser Leu Ala Lys Val Trp Gln His 625 630 635 640His Gly Ile Thr Pro Glu Ala Val Ile Gly His Ser Gln Gly Glu Ile 645 650655 Ala Ala Ala Tyr Val Ala Gly Ala Leu Thr Leu Asp Asp Ala Ala Arg 660665 670 Val Val Thr Leu Arg Ser Lys Ser Ile Ala Ala His Leu Ala Gly Lys675 680 685 Gly Gly Met Ile Ser Leu Ala Leu Ser Glu Glu Ala Thr Arg GlnArg 690 695 700 Ile Glu Asn Leu His Gly Leu Ser Ile Ala Ala Val Asn GlyPro Thr 705 710 715 720 Ala Thr Val Val Ser Gly Asp Pro Thr Gln Ile GlnGlu Leu Ala Gln 725 730 735 Ala Cys Glu Ala Asp Gly Ile Arg Ala Arg IleIle Pro Val Asp Tyr 740 745 750 Ala Ser His Ser Ala His Val Glu Thr IleGlu Asn Glu Leu Ala Asp 755 760 765 Val Leu Ala Gly Leu Ser Pro Gln ThrPro Gln Val Pro Phe Phe Ser 770 775 780 Thr Leu Glu Gly Thr Trp Ile ThrGlu Pro Ala Leu Asp Gly Gly Tyr 785 790 795 800 Trp Tyr Arg Asn Leu ArgHis Arg Val Gly Phe Ala Pro Ala Val Glu 805 810 815 Thr Leu Ala Thr AspGlu Gly Phe Thr His Phe Ile Glu Val Ser Ala 820 825 830 His Pro Val LeuThr Met Thr Leu Pro Asp Lys Val Thr Gly Leu Ala 835 840 845 Thr Leu ArgArg Glu Asp Gly Gly Gln His Arg Leu Thr Thr Ser Leu 850 855 860 Ala GluAla Trp Ala Asn Gly Leu Ala Leu Asp Trp Ala Ser Leu Leu 865 870 875 880Pro Ala Thr Gly Ala Leu Ser Pro Ala Val Pro Asp Leu Pro Thr Tyr 885 890895 Ala Phe Gln His Arg Ser Tyr Trp Ile Ser Pro Ala Gly Pro Gly Glu 900905 910 Ala Pro Ala His Thr Ala Ser Gly Arg Glu Ala Val Ala Glu Thr Gly915 920 925 Leu Ala Trp Gly Pro Gly Ala Glu Asp Leu Asp Glu Glu Gly ArgArg 
930 935 940 Ser Ala Val Leu Ala Met Val Met Arg Gln Ala Ala Ser ValLeu Arg 945 950 955 960 Cys Asp Ser Pro Glu Glu Val Pro Val Asp Arg ProLeu Arg Glu Ile 965 970 975 Gly Phe Asp Ser Leu Thr Ala Val Asp Phe ArgAsn Arg Val Asn Arg 980 985 990 Leu Thr Gly Leu Gln Leu Pro Pro Thr ValVal Phe Gln His Pro Thr 995 1000 1005 Pro Val Ala Leu Ala Glu Arg IleSer Asp Glu Leu Ala Glu Arg Asn 1010 1015 1020 Trp Ala Val Ala Glu ProSer Asp His Glu Gln Ala Glu Glu Glu Lys 1025 1030 1035 1040 Ala Ala AlaPro Ala Gly Ala Arg Ser Gly Ala Asp Thr Gly Ala Gly 1045 1050 1055 AlaGly Met Phe Arg Ala Leu Phe Arg Gln Ala Val Glu Asp Asp Arg 1060 10651070 Tyr Gly Glu Phe Leu Asp Val Leu Ala Glu Ala Ser Ala Phe Arg Pro1075 1080 1085 Gln Phe Ala Ser Pro Glu Ala Cys Ser Glu Arg Leu Asp ProVal Leu 1090 1095 1100 Leu Ala Gly Gly Pro Thr Asp Arg Ala Glu Gly ArgAla Val Leu Val 1105 1110 1115 1120 Gly Cys Thr Gly Thr Ala Ala Asn GlyGly Pro His Glu Phe Leu Arg 1125 1130 1135 Leu Ser Thr Ser Phe Gln GluGlu Arg Asp Phe Leu Ala Val Pro Leu 1140 1145 1150 Pro Gly Tyr Gly ThrGly Thr Gly Thr Gly Thr Ala Leu Leu Pro Ala 1155 1160 1165 Asp Leu AspThr Ala Leu Asp Ala Gln Ala Arg Ala Ile Leu Arg Ala 1170 1175 1180 AlaGly Asp Ala Pro Val Val Leu Leu Gly His Ser Gly Gly Ala Leu 1185 11901195 1200 Leu Ala His Glu Leu Ala Phe Arg Leu Glu Arg Ala His Gly AlaPro 1205 1210 1215 Pro Ala Gly Ile Val Leu Val Asp Pro Tyr Pro Pro GlyHis Gln Glu 1220 1225 1230 Pro Ile Glu Val Trp Ser Arg Gln Leu Gly GluGly Leu Phe Ala Gly 1235 1240 1245 Glu Leu Glu Pro Met Ser Asp Ala ArgLeu Leu Ala Met Gly Arg Tyr 1250 1255 1260 Ala Arg Phe Leu Ala Gly ProArg Pro Gly Arg Ser Ser Ala Pro Val 1265 1270 1275 1280 Leu Leu Val ArgAla Ser Glu Pro Leu Gly Asp Trp Gln Glu Glu Arg 1285 1290 1295 Gly AspTrp Arg Ala His Trp Asp Leu Pro His Thr Val Ala Asp Val 1300 1305 1310Pro Gly Asp His Phe Thr Met Met Arg Asp His Ala Pro Ala Val Ala 13151320 1325 Glu Ala Val Leu Ser Trp Leu Asp Ala Ile Glu Gly Ile Glu GlyAla 1330 1335 1340 Gly Lys 1345 38 1251 DNA 
Streptomyces venezuelae 38gtgcgccgta cccagcaggg aacgaccgct tctcccccgg tactcgacct cggggccctg 60gggcaggatt tcgcggccga tccgtatccg acgtacgcga gactgcgtgc cgagggtccg 120gcccaccggg tgcgcacccc cgagggggac gaggtgtggc tggtcgtcgg ctacgaccgg 180gcgcgggcgg tcctcgccga tccccggttc agcaaggact ggcgcaactc cacgactccc 240ctgaccgagg ccgaggccgc gctcaaccac aacatgctgg agtccgaccc gccgcggcac 300acccggctgc gcaagctggt ggcccgtgag ttcaccatgc gccgggtcga gttgctgcgg 360ccccgggtcc aggagatcgt cgacgggctc gtggacgcca tgctggcggc gcccgacggc 420cgcgccgatc tgatggagtc cctggcctgg ccgctgccga tcaccgtgat ctccgaactc 480ctcggcgtgc ccgagccgga ccgcgccgcc ttccgcgtct ggaccgacgc cttcgtcttc 540ccggacgatc ccgcccaggc ccagaccgcc atggccgaga tgagcggcta tctctcccgg 600ctcatcgact ccaagcgcgg gcaggacggc gaggacctgc tcagcgcgct cgtgcggacc 660agcgacgagg acggctcccg gctgacctcc gaggagctgc tcggtatggc ccacatcctg 720ctcgtcgcgg ggcacgagac cacggtcaat ctgatcgcca acggcatgta cgcgctgctc 780tcgcaccccg accagctggc cgccctgcgg gccgacatga cgctcttgga cggcgcggtg 840gaggagatgt tgcgctacga gggcccggtg gaatccgcga cctaccgctt cccggtcgag 900cccgtcgacc tggacggcac ggtcatcccg gccggtgaca cggtcctcgt cgtcctggcc 960gacgcccacc gcacccccga gcgcttcccg gacccgcacc gcttcgacat ccgccgggac 1020accgccggcc atctcgcctt cggccacggc atccacttct gcatcggcgc ccccttggcc 1080cggttggagg cccggatcgc cgtccgcgcc cttctcgaac gctgcccgga cctcgccctg 1140gacgtctccc ccggcgaact cgtgtggtat ccgaacccga tgattcgcgg gctcaaggcc 1200ctgccgatcc gctggcggcg aggacgggag gcgggccgcc gtaccggttg a 1251 39 416 PRTStreptomyces venezuelae 39 Met Arg Arg Thr Gln Gln Gly Thr Thr Ala SerPro Pro Val Leu Asp 1 5 10 15 Leu Gly Ala Leu Gly Gln Asp Phe Ala AlaAsp Pro Tyr Pro Thr Tyr 20 25 30 Ala Arg Leu Arg Ala Glu Gly Pro Ala HisArg Val Arg Thr Pro Glu 35 40 45 Gly Asp Glu Val Trp Leu Val Val Gly TyrAsp Arg Ala Arg Ala Val 50 55 60 Leu Ala Asp Pro Arg Phe Ser Lys Asp TrpArg Asn Ser Thr Thr Pro 65 70 75 80 Leu Thr Glu Ala Glu Ala Ala Leu AsnHis Asn Met Leu Glu Ser Asp 85 90 95 Pro Pro Arg His Thr Arg Leu Arg LysLeu Val Ala Arg Glu 
Phe Thr 100 105 110 Met Arg Arg Val Glu Leu Leu ArgPro Arg Val Gln Glu Ile Val Asp 115 120 125 Gly Leu Val Asp Ala Met LeuAla Ala Pro Asp Gly Arg Ala Asp Leu 130 135 140 Met Glu Ser Leu Ala TrpPro Leu Pro Ile Thr Val Ile Ser Glu Leu 145 150 155 160 Leu Gly Val ProGlu Pro Asp Arg Ala Ala Phe Arg Val Trp Thr Asp 165 170 175 Ala Phe ValPhe Pro Asp Asp Pro Ala Gln Ala Gln Thr Ala Met Ala 180 185 190 Glu MetSer Gly Tyr Leu Ser Arg Leu Ile Asp Ser Lys Arg Gly Gln 195 200 205 AspGly Glu Asp Leu Leu Ser Ala Leu Val Arg Thr Ser Asp Glu Asp 210 215 220Gly Ser Arg Leu Thr Ser Glu Glu Leu Leu Gly Met Ala His Ile Leu 225 230235 240 Leu Val Ala Gly His Glu Thr Thr Val Asn Leu Ile Ala Asn Gly Met245 250 255 Tyr Ala Leu Leu Ser His Pro Asp Gln Leu Ala Ala Leu Arg AlaAsp 260 265 270 Met Thr Leu Leu Asp Gly Ala Val Glu Glu Met Leu Arg TyrGlu Gly 275 280 285 Pro Val Glu Ser Ala Thr Tyr Arg Phe Pro Val Glu ProVal Asp Leu 290 295 300 Asp Gly Thr Val Ile Pro Ala Gly Asp Thr Val LeuVal Val Leu Ala 305 310 315 320 Asp Ala His Arg Thr Pro Glu Arg Phe ProAsp Pro His Arg Phe Asp 325 330 335 Ile Arg Arg Asp Thr Ala Gly His LeuAla Phe Gly His Gly Ile His 340 345 350 Phe Cys Ile Gly Ala Pro Leu AlaArg Leu Glu Ala Arg Ile Ala Val 355 360 365 Arg Ala Leu Leu Glu Arg CysPro Asp Leu Ala Leu Asp Val Ser Pro 370 375 380 Gly Glu Leu Val Trp TyrPro Asn Pro Met Ile Arg Gly Leu Lys Ala 385 390 395 400 Leu Pro Ile ArgTrp Arg Arg Gly Arg Glu Ala Gly Arg Arg Thr Gly 405 410 415 40 2787 DNAStreptomyces venezuelae 40 atgaatctgg tggaacgcga cggggagata gcccatctcagggccgttct tgacgcatcc 60 gccgcaggtg acgggacgct cttactcgtc tccggaccggccggcagcgg gaagacggag 120 ctgctgcggt cgctccgccg gctggccgcc gagcgggagacccccgtctg gtcggtccgg 180 gcgctgccgg gtgaccgcga catccccctg ggcgtcctctgccagttact ccgcagcgcc 240 gaacaacacg gtgccgacac ctccgccgtc cgcgacctgctggacgccgc ctcgcggcgg 300 gccggaaacc tcacctcccc cgccgacgcg ccgctccgcgtcgacgagac acaccgcctg 360 cacgactggc tgctctccgt ctcccgccgc accccgttcctcgtcgccgt cgacgacctg 420 acccacgccg 
acaccgcgtc cctgaggttc ctcctgtactgcgccgccca ccacgaccag 480 ggcggcatcg gcttcgtcat gaccgagcgg gcctcgcagcgcgccggata ccgggtgttc 540 cgcgccgagc tgctccgcca gccgcactgc cgcaacatgtggctctccgg gcttcccccc 600 agcggggtac gccagttact cgcccactac tacggccccgaggccgccga gcggcgggcc 660 cccgcgtacc acgcgacgac cggcgggaac ccgctgctcctgcgggcgct gacccaggac 720 cggcaggcct cccacaccac cctcggcgcg gccggcggcgacgagcccgt ccacggcgac 780 gccttcgccc aggccgtcct cgactgcctg caccgcagcgccgagggcac actggagacc 840 gcccgctggc tcgcggtcct cgaacagtcc gacccgctcctggtggagcg gctcacggga 900 acgaccgccg ccgccgtcga gcgccacatc caggagctcgccgccatcgg cctcctggac 960 gaggacggca ccctgggaca gcccgcgatc cgcgaggccgccctccagga cctgccggcc 1020 ggcgagcgca ccgaactgca ccggcgcgcc gcggagcagctgcaccggga cggcgccgac 1080 gaggacaccg tggcccgcca cctgctggtc ggcggcgcccccgacgctcc ctgggcgctg 1140 cccctgctcg aacggggcgc gcagcaggcc ctgttcgacgaccgactcga cgacgccttc 1200 cggatcctcg agttcgccgt gcggtcgagc accgacaacacccagctggc ccgcctcgcc 1260 ccacacctgg tcgcggcctc ctggcggatg aacccgcacatgacgacccg ggccctcgca 1320 ctcttcgacc ggctcctgag cggtgaactg ccgcccagccacccggtcat ggccctgatc 1380 cgctgcctcg tctggtacgg gcggctgccc gaggccgccgacgcgctgtc ccggctgcgg 1440 cccagctccg acaacgatgc cttggagctg tcgctcacccggatgtggct cgcggcgctg 1500 tgcccgccgc tcctggagtc cctgccggcc acgccggagccggagcgggg tcccgtcccc 1560 gtacggctcg cgccgcggac gaccgcgctc caggcccaggccggcgtctt ccagcggggc 1620 ccggacaacg cctcggtcgc gcaggccgaa cagatcctgcagggctgccg gctgtcggag 1680 gagacgtacg aggccctgga gacggccctc ttggtcctcgtccacgccga ccggctcgac 1740 cgggcgctgt tctggtcgga cgccctgctc gccgaggccgtggagcggcg gtcgctcggc 1800 tgggaggcgg tcttcgccgc gacccgggcg atgatcgcgatccgctgcgg cgacctcccg 1860 acggcgcggg agcgggccga gctggcgctc tcccacgcggcgccggagag ctggggcctc 1920 gccgtgggca tgcccctctc cgcgctgctg ctcgcctgcacggaggccgg cgagtacgaa 1980 caggcggagc gggtcctgcg gcagccggtg ccggacgcgatgttcgactc gcggcacggc 2040 atggagtaca tgcacgcccg gggccgctac tggctggcgacgggccggct gcacgcggcg 2100 ctgggcgagt tcatgctctg cggggagatc ctgggcagctggaacctcga 
ccagccctcg 2160 atcgtgccct ggcggacctc cgccgccgag gtgtacctgcggctcggcaa ccgccagaag 2220 gccagggcgc tggccgaggc ccagctcgcc ctggtgcggcccgggcgctc ccgcacccgg 2280 ggtctcaccc tgcgggtcct ggcggcggcg gtggacggccagcaggcgga gcggctgcac 2340 gccgaggcgg tcgacatgct gcacgacagc ggcgaccggctcgaacacgc ccgcgcgctc 2400 gccgggatga gccgccacca gcaggcccag ggggacaactaccgggcgag gatgacggcg 2460 cggctcgccg gcgacatggc gtgggcctgc ggcgcgtacccgctggccga ggagatcgtg 2520 ccgggccgcg gcggccgccg ggcgaaggcg gtgagcacggagctggaact gccgggcggc 2580 ccggacgtcg gcctgctctc ggaggccgaa cgccgggtggcggccctggc agcccgagga 2640 ttgacgaacc gccagatagc gcgccggctc tgcgtcaccgcgagcacggt cgaacagcac 2700 ctgacgcgcg tctaccgcaa actgaacgtg acccgccgagcagacctccc gatcagcctc 2760 gcccaggaca agtccgtcac ggcctga 2787 41 928 PRTStreptomyces venezuelae 41 Met Asn Leu Val Glu Arg Asp Gly Glu Ile AlaHis Leu Arg Ala Val 1 5 10 15 Leu Asp Ala Ser Ala Ala Gly Asp Gly ThrLeu Leu Leu Val Ser Gly 20 25 30 Pro Ala Gly Ser Gly Lys Thr Glu Leu LeuArg Ser Leu Arg Arg Leu 35 40 45 Ala Ala Glu Arg Glu Thr Pro Val Trp SerVal Arg Ala Leu Pro Gly 50 55 60 Asp Arg Asp Ile Pro Leu Gly Val Leu CysGln Leu Leu Arg Ser Ala 65 70 75 80 Glu Gln His Gly Ala Asp Thr Ser AlaVal Arg Asp Leu Leu Asp Ala 85 90 95 Ala Ser Arg Arg Ala Gly Asn Leu ThrSer Pro Ala Asp Ala Pro Leu 100 105 110 Arg Val Asp Glu Thr His Arg LeuHis Asp Trp Leu Leu Ser Val Ser 115 120 125 Arg Arg Thr Pro Phe Leu ValAla Val Asp Asp Leu Thr His Ala Asp 130 135 140 Thr Ala Ser Leu Arg PheLeu Leu Tyr Cys Ala Ala His His Asp Gln 145 150 155 160 Gly Gly Ile GlyPhe Val Met Thr Glu Arg Ala Ser Gln Arg Ala Gly 165 170 175 Tyr Arg ValPhe Arg Ala Glu Leu Leu Arg Gln Pro His Cys Arg Asn 180 185 190 Met TrpLeu Ser Gly Leu Pro Pro Ser Gly Val Arg Gln Leu Leu Ala 195 200 205 HisTyr Tyr Gly Pro Glu Ala Ala Glu Arg Arg Ala Pro Ala Tyr His 210 215 220Ala Thr Thr Gly Gly Asn Pro Leu Leu Leu Arg Ala Leu Thr Gln Asp 225 230235 240 Arg Gln Ala Ser His Thr Thr Leu Gly Ala Ala Gly Gly Asp Glu Pro245 250 255 Val His Gly Asp 
Ala Phe Ala Gln Ala Val Leu Asp Cys Leu HisArg 260 265 270 Ser Ala Glu Gly Thr Leu Glu Thr Ala Arg Trp Leu Ala ValLeu Glu 275 280 285 Gln Ser Asp Pro Leu Leu Val Glu Arg Leu Thr Gly ThrThr Ala Ala 290 295 300 Ala Val Glu Arg His Ile Gln Glu Leu Ala Ala IleGly Leu Leu Asp 305 310 315 320 Glu Asp Gly Thr Leu Gly Gln Pro Ala IleArg Glu Ala Ala Leu Gln 325 330 335 Asp Leu Pro Ala Gly Glu Arg Thr GluLeu His Arg Arg Ala Ala Glu 340 345 350 Gln Leu His Arg Asp Gly Ala AspGlu Asp Thr Val Ala Arg His Leu 355 360 365 Leu Val Gly Gly Ala Pro AspAla Pro Trp Ala Leu Pro Leu Leu Glu 370 375 380 Arg Gly Ala Gln Gln AlaLeu Phe Asp Asp Arg Leu Asp Asp Ala Phe 385 390 395 400 Arg Ile Leu GluPhe Ala Val Arg Ser Ser Thr Asp Asn Thr Gln Leu 405 410 415 Ala Arg LeuAla Pro His Leu Val Ala Ala Ser Trp Arg Met Asn Pro 420 425 430 His MetThr Thr Arg Ala Leu Ala Leu Phe Asp Arg Leu Leu Ser Gly 435 440 445 GluLeu Pro Pro Ser His Pro Val Met Ala Leu Ile Arg Cys Leu Val 450 455 460Trp Tyr Gly Arg Leu Pro Glu Ala Ala Asp Ala Leu Ser Arg Leu Arg 465 470475 480 Pro Ser Ser Asp Asn Asp Ala Leu Glu Leu Ser Leu Thr Arg Met Trp485 490 495 Leu Ala Ala Leu Cys Pro Pro Leu Leu Glu Ser Leu Pro Ala ThrPro 500 505 510 Glu Pro Glu Arg Gly Pro Val Pro Val Arg Leu Ala Pro ArgThr Thr 515 520 525 Ala Leu Gln Ala Gln Ala Gly Val Phe Gln Arg Gly ProAsp Asn Ala 530 535 540 Ser Val Ala Gln Ala Glu Gln Ile Leu Gln Gly CysArg Leu Ser Glu 545 550 555 560 Glu Thr Tyr Glu Ala Leu Glu Thr Ala LeuLeu Val Leu Val His Ala 565 570 575 Asp Arg Leu Asp Arg Ala Leu Phe TrpSer Asp Ala Leu Leu Ala Glu 580 585 590 Ala Val Glu Arg Arg Ser Leu GlyTrp Glu Ala Val Phe Ala Ala Thr 595 600 605 Arg Ala Met Ile Ala Ile ArgCys Gly Asp Leu Pro Thr Ala Arg Glu 610 615 620 Arg Ala Glu Leu Ala LeuSer His Ala Ala Pro Glu Ser Trp Gly Leu 625 630 635 640 Ala Val Gly MetPro Leu Ser Ala Leu Leu Leu Ala Cys Thr Glu Ala 645 650 655 Gly Glu TyrGlu Gln Ala Glu Arg Val Leu Arg Gln Pro Val Pro Asp 660 665 670 Ala MetPhe Asp Ser Arg His Gly Met Glu Tyr Met 
His Ala Arg Gly 675 680 685 ArgTyr Trp Leu Ala Thr Gly Arg Leu His Ala Ala Leu Gly Glu Phe 690 695 700Met Leu Cys Gly Glu Ile Leu Gly Ser Trp Asn Leu Asp Gln Pro Ser 705 710715 720 Ile Val Pro Trp Arg Thr Ser Ala Ala Glu Val Tyr Leu Arg Leu Gly725 730 735 Asn Arg Gln Lys Ala Arg Ala Leu Ala Glu Ala Gln Leu Ala LeuVal 740 745 750 Arg Pro Gly Arg Ser Arg Thr Arg Gly Leu Thr Leu Arg ValLeu Ala 755 760 765 Ala Ala Val Asp Gly Gln Gln Ala Glu Arg Leu His AlaGlu Ala Val 770 775 780 Asp Met Leu His Asp Ser Gly Asp Arg Leu Glu HisAla Arg Ala Leu 785 790 795 800 Ala Gly Met Ser Arg His Gln Gln Ala GlnGly Asp Asn Tyr Arg Ala 805 810 815 Arg Met Thr Ala Arg Leu Ala Gly AspMet Ala Trp Ala Cys Gly Ala 820 825 830 Tyr Pro Leu Ala Glu Glu Ile ValPro Gly Arg Gly Gly Arg Arg Ala 835 840 845 Lys Ala Val Ser Thr Glu LeuGlu Leu Pro Gly Gly Pro Asp Val Gly 850 855 860 Leu Leu Ser Glu Ala GluArg Arg Val Ala Ala Leu Ala Ala Arg Gly 865 870 875 880 Leu Thr Asn ArgGln Ile Ala Arg Arg Leu Cys Val Thr Ala Ser Thr 885 890 895 Val Glu GlnHis Leu Thr Arg Val Tyr Arg Lys Leu Asn Val Thr Arg 900 905 910 Arg AlaAsp Leu Pro Ile Ser Leu Ala Gln Asp Lys Ser Val Thr Ala 915 920 925 42846 DNA Streptomyces venezuelae 42 gtgaccgaca gacctctgaa cgtggacagcggactgtgga tccggcgctt ccaccccgcg 60 ccgaacagcg cggtgcggct ggtctgcctgccgcacgccg gcggctccgc cagctacttc 120 ttccgcttct cggaggagct gcacccctccgtcgaggccc tgtcggtgca gtatccgggc 180 cgccaggacc ggcgtgccga gccgtgtctggagagcgtcg aggagctcgc cgagcatgtg 240 gtcgcggcca ccgaaccctg gtggcaggagggccggctgg ccttcttcgg gcacagcctc 300 ggcgcctccg tcgccttcga gacggcccgcatcctggaac agcggcacgg ggtacggccc 360 gagggcctgt acgtctccgg tcggcgcgccccgtcgctgg cgccggaccg gctcgtccac 420 cagctggacg accgggcgtt cctggccgagatccggcggc tcagcggcac cgacgagcgg 480 ttcctccagg acgacgagct gctgcggctggtgctgcccg cgctgcgcag cgactacaag 540 gcggcggaga cgtacctgca ccggccgtccgccaagctca cctgcccggt gatggccctg 600 gccggcgacc gtgacccgaa ggcgccgctgaacgaggtgg ccgagtggcg tcggcacacc 660 agcgggccgt tctgcctccg 
ggcgtactccggcggccact tctacctcaa cgaccagtgg 720 cacgagatct gcaacgacat ctccgaccacctgctcgtca cccgcggcgc gcccgatgcc 780 cgcgtcgtgc agcccccgac cagccttatcgaaggagcgg cgaagagatg gcagaaccca 840 cggtga 846 43 281 PRT Streptomycesvenezuelae 43 Met Thr Asp Arg Pro Leu Asn Val Asp Ser Gly Leu Trp IleArg Arg 1 5 10 15 Phe His Pro Ala Pro Asn Ser Ala Val Arg Leu Val CysLeu Pro His 20 25 30 Ala Gly Gly Ser Ala Ser Tyr Phe Phe Arg Phe Ser GluGlu Leu His 35 40 45 Pro Ser Val Glu Ala Leu Ser Val Gln Tyr Pro Gly ArgGln Asp Arg 50 55 60 Arg Ala Glu Pro Cys Leu Glu Ser Val Glu Glu Leu AlaGlu His Val 65 70 75 80 Val Ala Ala Thr Glu Pro Trp Trp Gln Glu Gly ArgLeu Ala Phe Phe 85 90 95 Gly His Ser Leu Gly Ala Ser Val Ala Phe Glu ThrAla Arg Ile Leu 100 105 110 Glu Gln Arg His Gly Val Arg Pro Glu Gly LeuTyr Val Ser Gly Arg 115 120 125 Arg Ala Pro Ser Leu Ala Pro Asp Arg LeuVal His Gln Leu Asp Asp 130 135 140 Arg Ala Phe Leu Ala Glu Ile Arg ArgLeu Ser Gly Thr Asp Glu Arg 145 150 155 160 Phe Leu Gln Asp Asp Glu LeuLeu Arg Leu Val Leu Pro Ala Leu Arg 165 170 175 Ser Asp Tyr Lys Ala AlaGlu Thr Tyr Leu His Arg Pro Ser Ala Lys 180 185 190 Leu Thr Cys Pro ValMet Ala Leu Ala Gly Asp Arg Asp Pro Lys Ala 195 200 205 Pro Leu Asn GluVal Ala Glu Trp Arg Arg His Thr Ser Gly Pro Phe 210 215 220 Cys Leu ArgAla Tyr Ser Gly Gly His Phe Tyr Leu Asn Asp Gln Trp 225 230 235 240 HisGlu Ile Cys Asn Asp Ile Ser Asp His Leu Leu Val Thr Arg Gly 245 250 255Ala Pro Asp Ala Arg Val Val Gln Pro Pro Thr Ser Leu Ile Glu Gly 260 265270 Ala Ala Lys Arg Trp Gln Asn Pro Arg 275 280

What is claimed is:
 1. An isolated and purified nucleic acid segment comprising a nucleic acid sequence comprising a desosamine biosynthetic gene cluster, a fragment or a biologically active variant thereof, wherein the nucleic acid sequence is not derived from the eryC gene cluster of Saccharopolyspora erythraea or Streptomyces antibioticus.
 2. The isolated and purified nucleic acid segment of claim 1 comprising SEQ ID NO:3.
 3. The isolated and purified nucleic acid segment of claim 1 which encodes DesI, DesII, DesIII, DesIV, DesV, DesVI, DesVII, DesVIII or DesR.
 4. The isolated and purified nucleic acid segment of claim 1 which is from Streptomyces venezuelae.
 5. An expression cassette comprising the nucleic acid segment of claim 1 operably linked to a promoter functional in a host cell.
 6. A recombinant bacterial host cell in which at least a portion of a nucleic acid sequence encoding desosamine is disrupted so as to result in a decrease or lack of desosamine synthesis, wherein the nucleic acid sequence which is disrupted is not derived from the eryC gene cluster of Saccharopolyspora erythraea.
 7. The host cell of claim 6 wherein the nucleic acid sequence which is disrupted encodes DesI, DesII, DesIII, DesIV, DesV, DesVI, DesVII, DesVIII or DesR.
 8. A host cell, the genome of which is augmented with the expression cassette of claim 5.
 9. A product produced by the host cell of claim 6 which is not produced by the corresponding non-recombinant host cell.
 10. The product of claim 9 which comprises a macrolide.
 11. An isolated and purified nucleic acid segment comprising a nucleic acid sequence comprising a macrolide biosynthetic gene cluster encoding methymycin, pikromycin, neomethymycin, narbomycin, or a combination thereof, or a biologically active variant or fragment thereof.
 12. The isolated and purified nucleic acid segment of claim 11 comprising SEQ ID NO:5.
 13. The isolated and purified nucleic acid segment of claim 11 comprising a biologically active variant or fragment of SEQ ID NO:5.
 14. The isolated and purified nucleic acid segment of claim 11 which encodes PikR1, PikR2, PikAI, PikAII, PikAIII, PikAIV, PikAV, PikC or PikD.
 15. The isolated and purified nucleic acid segment of claim 11 which is from Streptomyces venezuelae.
 16. A host cell, the genome of which is augmented with the nucleic acid segment of claim 11.
 17. An isolated and purified nucleic acid sequence comprising SEQ ID NO:3, SEQ ID NO:5, a fragment thereof, the complement thereto, or which hybridizes thereto.
 18. An isolated polypeptide encoded by the nucleic acid segment of claim 1 or 11.
 19. A recombinant host cell in which a pikAI gene, a pikAII gene, a pikAIII gene, a pikAIV gene, a pikB gene cluster, a pikAV gene cluster, a pikC gene, a pikR1 gene, a pikR2 gene, or a combination thereof, is disrupted so as to reduce or eliminate production of methymycin, neomethymycin, pikromycin, narbomycin, or a combination thereof.
 20. A macrolide or polyketide produced by the host cell of claim 19 which is not produced by the corresponding non-recombinant host cell.
 21. An isolated and purified DNA molecule comprising a first DNA segment encoding a first module and a second DNA segment encoding a second module, wherein the DNA segments together encode a recombinant polyhydroxyalkanoate monomer synthase, and wherein at least one DNA segment is derived from the pikA gene cluster of Streptomyces venezuelae.
 22. A method of providing a polyhydroxyalkanoate monomer, comprising: (a) introducing into a host cell a DNA molecule comprising a DNA segment encoding a recombinant polyhydroxyalkanoate monomer synthase operably linked to a promoter functional in the host cell, wherein the recombinant polyhydroxyalkanoate monomer synthase comprises a first module and a second module, and wherein at least one DNA segment is derived from the pikA gene cluster of Streptomyces venezuelae; and (b) expressing the DNA encoding the recombinant polyhydroxyalkanoate monomer synthase in the host cell so as to generate a polyhydroxyalkanoate monomer.
 23. A recombinant vector comprising one or more modules of a polyketide synthase wherein at least one module is from Streptomyces venezuelae.
 24. The method of claim 22 wherein the first module encodes a fatty acid synthase.
 25. A method of providing a polyhydroxyalkanoate monomer, comprising: (a) introducing into a host cell a DNA molecule encoding a fusion polypeptide, wherein the DNA molecule comprises a first DNA segment operably linked to a promoter functional in the host cell and a second DNA segment, wherein at least one DNA segment is derived from the pikA gene cluster of Streptomyces venezuelae; and (b) expressing the DNA in the host cell so as to generate the fusion polypeptide.
 26. The host cell of claim 16, the native genome of which does not comprise an intact macrolide biosynthetic gene cluster encoding methymycin, pikromycin, neomethymycin, or narbomycin.